Methods, devices, and systems for constructing intelligent knowledge base

ABSTRACT

The present disclosure includes a method for constructing an intelligent knowledge base. The method comprises: obtaining, via an intelligent-knowledge-base constructing device, a plurality of abstract semantic expressions, wherein each of the plurality of abstract semantic expressions comprises a semantic-lacking element; receiving an initial request message from a user; acquiring, via the intelligent-knowledge-base constructing device, one or more abstract semantic expressions corresponding to the initial request message by performing an abstract semantic recommending process on the initial request message based on the plurality of abstract semantic expressions; extracting, from the initial request message, an element corresponding to the semantic-lacking element of the one or more abstract semantic expressions; filling the extracted element into the semantic-lacking element to obtain one or more specific semantic expressions corresponding to the initial request message; and storing the initial request message and the one or more specific semantic expressions into the intelligent knowledge base.

CROSS-REFERENCE TO RELATED APPLICATION

This application claims the priority and benefit of Chinese Patent Application No. 201511028179.2, filed on Dec. 31, 2015, Chinese Patent Application No. 201511030319.X, filed on Dec. 31, 2015, Chinese Patent Application No. 201511030332.5, filed on Dec. 31, 2015, and Chinese Patent Application No. 201511030353.7, filed on Dec. 31, 2015. The above four Chinese applications are incorporated herein in their entirety by reference.

TECHNICAL FIELD

The present disclosure relates to a data processing technology field, and more particularly, to methods, devices, and systems for constructing an intelligent knowledge base such as a question-answer knowledge base based on semantic similarity calculation and/or abstract semantic recommendation.

BACKGROUND

A knowledge base is also known as an intelligent subject domain database or an artificial intelligence database. The subject domain database is a well-structured, easy-to-operate, easy-to-use and fully organized knowledge cluster. In order to answer questions in one (or some) field, the subject domain database is constructed as a collection of relevant knowledge pieces which are stored, organized, managed, and used in computer storage in one (or some) knowledge expression form. The knowledge pieces may include theoretical knowledge and factual data related to this field, and heuristic knowledge obtained from expert experience, such as definitions in this field, theorems, operation rules, common knowledge, etc.

The subject domain database has been widely used. A typical application of the subject domain database is an intelligent question-answer knowledge base or an automatic question-answer knowledge base. The automatic question-answer knowledge base stores a number of preset questions and a number of answers corresponding to the preset questions. When a user asks a question (or sends an initial request message), the automatic question-answer knowledge base tries to find a match for the user question among the preset questions. If a match is found, the automatic question-answer knowledge base returns to the user the answer corresponding to the matched preset question. The user thereby obtains an answer to his or her question.

However, questions asked by different users have different viewpoints and different expressions. In order to improve the accuracy of the question-answer knowledge base, a lot of questions have to be manually input into a question database. This process is time-consuming, labor-intensive, and inefficient. Also, each question is paired with an answer, and thus the knowledge base needs a huge amount of storage space to store the paired questions and answers. In addition, due to the limitation of the storage space and/or the knowledge volume, the knowledge base may not store enough paired questions and answers and may not provide an answer to a user's question.

Thus, there is a need to develop devices, systems, and methods that efficiently store knowledge and construct an intelligent knowledge base, dynamically generate answers to user queries, and overcome the limitations of conventional question-answer knowledge bases. Devices, systems, and methods disclosed below address the above-described needs.

SUMMARY

The present disclosure includes an exemplary device for constructing an intelligent knowledge base. An exemplary device in accordance with the present disclosure comprises: an abstract semantic expression obtaining unit to obtain a plurality of abstract semantic expressions from an abstract semantic database, wherein each of the plurality of abstract semantic expressions comprises a semantic-lacking element; a receiving unit to receive an initial request message from a user; an abstract semantic recommending module, coupled to the abstract semantic expression obtaining unit and the receiving unit, to acquire one or more abstract semantic expressions corresponding to the initial request message by performing an abstract semantic recommending process on the initial request message based on the plurality of abstract semantic expressions; a filling unit, coupled to the abstract semantic recommending module, to extract, from the initial request message, an element corresponding to the semantic-lacking element of the one or more abstract semantic expressions, and to fill the extracted element into the semantic-lacking element to obtain one or more specific semantic expressions corresponding to the initial request message; and a storing unit to store the initial request message and the one or more specific semantic expressions into the intelligent knowledge base.

Another exemplary device in accordance with the present disclosure comprises: a preset knowledge subject obtaining unit to obtain a plurality of preset knowledge subjects from a subject domain database, wherein each preset knowledge subject comprises a standard question and one or more extended questions; a receiving unit to receive an initial request message; a calculation unit, coupled to the preset knowledge subject obtaining unit and the receiving unit, to perform a semantic similarity calculation on the initial request message and the plurality of preset knowledge subjects to obtain a plurality of semantic similarity calculation results; a determination unit, coupled to the calculation unit, to determine whether a largest one of the plurality of semantic similarity calculation results is greater than a similarity threshold value; and a storing unit to, when the largest one of the plurality of semantic similarity calculation results is greater than the similarity threshold value, store into the intelligent knowledge base the initial request message and the standard question and the one or more extended questions of a preset knowledge subject corresponding to the largest one of the plurality of semantic similarity calculation results.

Yet another exemplary device in accordance with the present disclosure comprises: a preset knowledge subject obtaining unit to obtain a plurality of preset knowledge subjects from a subject domain database, wherein each preset knowledge subject comprises a standard question and one or more extended questions; a receiving unit to receive an initial request message; a calculation unit, coupled to the preset knowledge subject obtaining unit and the receiving unit, to perform a semantic similarity calculation on the initial request message and the plurality of preset knowledge subjects to obtain a plurality of semantic similarity calculation results; a determination unit to determine whether a largest one of the plurality of semantic similarity calculation results is greater than a similarity threshold value; an abstract semantic expression obtaining unit to obtain a plurality of abstract semantic expressions from an abstract semantic database, wherein each of the plurality of abstract semantic expressions comprises a semantic-lacking element; an abstract semantic recommending module, coupled to the abstract semantic expression obtaining unit and the receiving unit, to obtain one or more abstract semantic expressions corresponding to the initial request message by performing, when the largest one of the plurality of semantic similarity calculation results is smaller than the similarity threshold value, an abstract semantic recommending process on the initial request message based on the plurality of abstract semantic expressions; a filling unit, coupled to the abstract semantic recommending module, to extract from the initial request message an element corresponding to the semantic-lacking element of the one or more abstract semantic expressions, and to fill the extracted element into the semantic-lacking element to obtain one or more specific semantic expressions corresponding to the initial request message; and a storing unit to, when the largest one of the plurality of semantic similarity calculation results is greater than the similarity threshold value, store into the intelligent knowledge base the initial request message and the standard question and the one or more extended questions of a preset knowledge subject corresponding to the largest one of the plurality of semantic similarity calculation results, and/or when the largest one of the plurality of semantic similarity calculation results is smaller than the similarity threshold value, store the initial request message and the one or more specific semantic expressions into the intelligent knowledge base.

The present disclosure also includes an exemplary method for constructing an intelligent knowledge base. An exemplary method in accordance with the present disclosure comprises: obtaining, via an intelligent-knowledge-base constructing device, a plurality of abstract semantic expressions, wherein each of the plurality of abstract semantic expressions comprises a semantic-lacking element; receiving an initial request message from a user; acquiring, via the intelligent-knowledge-base constructing device, one or more abstract semantic expressions corresponding to the initial request message by performing an abstract semantic recommending process on the initial request message based on the plurality of abstract semantic expressions; extracting, from the initial request message, an element corresponding to the semantic-lacking element of the one or more abstract semantic expressions; filling the extracted element into the semantic-lacking element to obtain one or more specific semantic expressions corresponding to the initial request message; and storing the initial request message and the one or more specific semantic expressions into the intelligent knowledge base.

Another exemplary method in accordance with the present disclosure comprises: obtaining, via an intelligent-knowledge-base constructing device, a plurality of preset knowledge subjects from a subject domain database, wherein each preset knowledge subject comprises a standard question and one or more extended questions; receiving an initial request message; performing, via the intelligent-knowledge-base constructing device, a semantic similarity calculation on the initial request message and the plurality of preset knowledge subjects to obtain a plurality of semantic similarity calculation results; determining whether a largest one of the plurality of semantic similarity calculation results is greater than a similarity threshold value; and upon determining that the largest one of the plurality of semantic similarity calculation results is greater than the similarity threshold value, storing into the intelligent knowledge base the initial request message and the standard question and the one or more extended questions of a preset knowledge subject corresponding to the largest one of the plurality of semantic similarity calculation results.

Yet another exemplary method in accordance with the present disclosure comprises: obtaining, via an intelligent-knowledge-base constructing device, a plurality of preset knowledge subjects from a subject domain database, wherein each preset knowledge subject comprises a standard question and one or more extended questions; receiving an initial request message; performing, via the intelligent-knowledge-base constructing device, a semantic similarity calculation on the initial request message and the plurality of preset knowledge subjects to obtain a plurality of semantic similarity calculation results; determining whether a largest one of the plurality of semantic similarity calculation results is greater than a similarity threshold value; obtaining a plurality of abstract semantic expressions from an abstract semantic database, wherein each of the plurality of abstract semantic expressions comprises a semantic-lacking element; obtaining one or more abstract semantic expressions corresponding to the initial request message by performing, when the largest one of the plurality of semantic similarity calculation results is smaller than the similarity threshold value, an abstract semantic recommending process on the initial request message based on the plurality of abstract semantic expressions; extracting, from the initial request message, an element corresponding to the semantic-lacking element of the one or more abstract semantic expressions; filling the extracted element into the semantic-lacking element to obtain one or more specific semantic expressions corresponding to the initial request message; and storing, when the largest one of the plurality of semantic similarity calculation results is greater than the similarity threshold value, into the intelligent knowledge base the initial request message and the standard question and the one or more extended questions of a preset knowledge subject corresponding to the largest one of the plurality of semantic similarity calculation results, and/or storing, when the largest one of the plurality of semantic similarity calculation results is smaller than the similarity threshold value, the initial request message and the one or more specific semantic expressions into the intelligent knowledge base.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A-1C each illustrate a block diagram of an exemplary system for constructing an intelligent knowledge base, according to embodiments of the present disclosure.

FIGS. 2A-2B each illustrate a block diagram of an exemplary intelligent-knowledge-base constructing device of FIG. 1A, according to embodiments of the present disclosure.

FIGS. 2C-2E each illustrate a block diagram of an exemplary intelligent-knowledge-base constructing device of FIG. 1B or 1C, according to embodiments of the present disclosure.

FIGS. 3A-3B each illustrate a block diagram of an exemplary abstract semantic recommending module of FIGS. 2C-2E, according to embodiments of the present disclosure.

FIG. 4 illustrates a flow chart of an exemplary method for constructing an intelligent knowledge base according to embodiments of the present disclosure.

FIG. 5 illustrates a flow chart of another exemplary method for constructing an intelligent knowledge base according to embodiments of the present disclosure.

FIG. 6 illustrates a flow chart of an exemplary embodiment of an abstract semantic recommending process according to embodiments of the present disclosure.

FIG. 7 illustrates a flow chart of another exemplary embodiment of an abstract semantic recommending process according to embodiments of the present disclosure.

FIG. 8 illustrates a flow chart of yet another exemplary embodiment of an abstract semantic recommending process according to embodiments of the present disclosure.

FIG. 9 illustrates a flow chart of an exemplary embodiment of a process for extracting from an initial request message an element corresponding to a semantic-lacking element of an abstract semantic expression, and filling the extracted element to the semantic-lacking element to obtain one or more specific semantic expressions corresponding to the initial request message, according to embodiments of the present disclosure.

FIG. 10 illustrates a flow chart of an exemplary method for constructing an intelligent knowledge base based on abstract semantic recommendation, according to embodiments of the present disclosure.

DETAILED DESCRIPTION

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts.

Devices, systems, and methods for constructing an intelligent knowledge base such as a question-answer knowledge base based on semantic similarity calculation and/or abstract semantic recommendation, according to embodiments of the present disclosure, are provided herein to address the above described needs.

FIG. 1A illustrates a block diagram of an exemplary system 100 for constructing an intelligent knowledge base based on semantic similarity calculation, consistent with the present disclosure. Exemplary system 100 can include, among other things, a human-machine interface 102, an intelligent-knowledge-base constructing device 104, a subject domain database 106, and an intelligent knowledge base 108. Human-machine interface 102 is coupled, through wired or wireless communication means, to intelligent-knowledge-base constructing device 104, which is further coupled to a subject domain database 106 and an intelligent knowledge base 108 through wired or wireless communication means.

Human-machine interface 102 can be a hardware device such as a computer, a PDA, a cell phone, a laptop, a desktop, or any computing device running one or more computer programs to provide an interface for a human user. Through the interface, the user can input an initial request message and interact with intelligent-knowledge-base constructing device 104, via text or voice. The initial request message can be text or voice. In some embodiments, the initial voice request message may be converted into text.

Intelligent-knowledge-base constructing device 104 can be a hardware device running one or more computer programs to construct an intelligent knowledge base such as a question-answer knowledge base, based on semantic similarity calculation, abstract semantic recommendation, or other algorithms. For example, in some embodiments, device 104 receives the initial request message from human-machine interface 102 and acquires preset knowledge subjects from subject domain database 106. Device 104 then performs a semantic similarity calculation on the initial request message and the preset knowledge subjects to obtain semantic similarity calculation results, and stores the initial request message and the preset knowledge subjects into the intelligent knowledge base 108 based on the semantic similarity calculation results. Further details will be described below.

Subject domain database 106 is a storage device storing a structured collection of records or data of preset knowledge subjects such as business logic in a specific business field, e.g., communication field, finance field, e-government field, e-commerce field, daily life field, intelligent home field, intelligent transportation field, etc. A preset knowledge subject may include a standard question and one or more extended questions. The one or more extended questions are different expression forms of the standard question, but have the same semantic meaning as the standard question. So that the subject domain database can be used in different intelligent knowledge bases in the same field, the preset knowledge subjects may be common knowledge in that field. In some embodiments, the preset knowledge subject not only includes a standard question and one or more extended questions, but also includes an answer corresponding to the standard question and the one or more extended questions. In some embodiments, the preset knowledge subjects stored in the subject domain database have a text form. In other embodiments, the preset knowledge subjects may be stored in other forms. For example, the standard question and the extended questions of the preset knowledge subject are stored in a text form, while the corresponding answers are stored in a voice form, a video form or other multi-media forms. When there are a plurality of preset knowledge subjects, each preset knowledge subject has a corresponding storage space and a corresponding storage address. In some embodiments, database 106 arranges the memory according to the data structures and relations stored there to improve the storage efficiency. Further details will be described below.
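As an illustration only, a preset knowledge subject as described above might be represented by a record such as the following. This is a minimal sketch; the class name, field names, and sample questions are hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PresetKnowledgeSubject:
    """Hypothetical record for one preset knowledge subject."""
    standard_question: str
    # Different expression forms of the standard question, same meaning.
    extended_questions: List[str] = field(default_factory=list)
    # Present only in embodiments that also store an answer.
    answer: Optional[str] = None

    def all_questions(self) -> List[str]:
        """Return the standard question followed by its extended questions."""
        return [self.standard_question, *self.extended_questions]


# Illustrative data for a communication-field subject (made up for this sketch).
subject = PresetKnowledgeSubject(
    standard_question="How do I activate call forwarding?",
    extended_questions=[
        "How can call forwarding be turned on?",
        "What is the way to enable call forwarding?",
    ],
)
```

Keeping the answer optional mirrors the statement that only some embodiments store an answer with the questions.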

Intelligent knowledge base 108 such as a question-answer knowledge base is a storage device storing a structured collection of records or data of initial request messages, preset knowledge subjects including standard questions and extended questions, and/or specific semantic expressions. In some embodiments, intelligent knowledge base 108 arranges the memory according to the data structures and relations stored there to improve the storage efficiency. Further details will be described below.

FIG. 1B illustrates a block diagram of another exemplary system 100 for constructing an intelligent knowledge base based on semantic similarity calculation and abstract semantic recommendation, consistent with the present disclosure. In some embodiments, exemplary system 100 may also include an abstract semantic database 110, in addition to human-machine interface 102, intelligent-knowledge-base constructing device 104, subject domain database 106, and an intelligent knowledge base 108, as described above.

Abstract semantic database 110 is a storage device storing a structured collection of records or data of abstract semantic expressions each of which includes a semantic-lacking element. An abstract semantic expression may include not only a semantic-lacking element, but also a semantic rule word. In some embodiments, the semantic rule word is marked with wordclass information. The wordclass information indicates that the semantic rule word belongs to a wordclass. A wordclass includes several key words having a same usage and a similar semantic meaning. In some embodiments, database 110 arranges the memory according to the data structures and relations stored there to improve the storage efficiency. Further details about database 110 will be described below.
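One way to picture an abstract semantic expression is as a sequence of semantic rule words, each tagged with a wordclass, and semantic-lacking elements that remain to be filled. The sketch below assumes this token-based layout; the type names, the wordclass table, and its contents are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple, Union


@dataclass(frozen=True)
class SemanticLackingElement:
    """A placeholder to be filled from the initial request message."""
    name: str


@dataclass(frozen=True)
class SemanticRuleWord:
    """A rule word marked with the wordclass it belongs to."""
    wordclass: str


Token = Union[SemanticLackingElement, SemanticRuleWord]


@dataclass(frozen=True)
class AbstractSemanticExpression:
    tokens: Tuple[Token, ...]


# Hypothetical wordclass table: key words with the same usage and similar meaning.
WORDCLASSES: Dict[str, FrozenSet[str]] = {
    "activate": frozenset({"activate", "enable", "open"}),
    "how": frozenset({"how", "way"}),
}


def lacking_elements(expr: AbstractSemanticExpression) -> List[str]:
    """List the names of the semantic-lacking elements in an expression."""
    return [t.name for t in expr.tokens if isinstance(t, SemanticLackingElement)]


def in_wordclass(word: str, wordclass: str) -> bool:
    """Check whether a key word belongs to the given wordclass."""
    return word.lower() in WORDCLASSES.get(wordclass, frozenset())


expr = AbstractSemanticExpression(
    tokens=(
        SemanticRuleWord("how"),
        SemanticRuleWord("activate"),
        SemanticLackingElement("service"),
    )
)
```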

In some embodiments, as shown in FIG. 1B, intelligent-knowledge-base constructing device 104 receives an initial request message from human-machine interface 102 and acquires preset knowledge subjects from subject domain database 106. Device 104 then performs a semantic similarity calculation on the initial request message and the preset knowledge subjects to obtain semantic similarity calculation results. Further, device 104 obtains abstract semantic expressions each including a semantic-lacking element from abstract semantic database 110. Device 104 then performs an abstract semantic recommending process on the initial request message based on the semantic similarity calculation results and the abstract semantic expressions to obtain one or more abstract semantic expressions corresponding to the initial request message. Also, device 104 extracts, from the initial request message, an element corresponding to the semantic-lacking element of the one or more abstract semantic expressions. Afterwards, device 104 fills the extracted element into the semantic-lacking element to obtain one or more specific semantic expressions corresponding to the initial request message. Finally, device 104 stores the initial request message and the one or more specific semantic expressions into intelligent knowledge base 108. Further details will be described below.

FIG. 1C illustrates a block diagram of another exemplary system 100 for constructing an intelligent knowledge base based on abstract semantic recommendation, consistent with the present disclosure. As shown in FIG. 1C, in some embodiments, exemplary system 100 may also include an abstract semantic database 110, in addition to human-machine interface 102, intelligent-knowledge-base constructing device 104, and an intelligent knowledge base 108, as described above.

In some embodiments, as shown in FIG. 1C, intelligent-knowledge-base constructing device 104 receives an initial request message from human-machine interface 102 and obtains abstract semantic expressions each including a semantic-lacking element from abstract semantic database 110. Device 104 then performs an abstract semantic recommending process on the initial request message based on abstract semantic database 110 to obtain one or more abstract semantic expressions corresponding to the initial request message. Also, device 104 extracts, from the initial request message, an element corresponding to the semantic-lacking element of the one or more abstract semantic expressions. Device 104 then fills the extracted element into the semantic-lacking element to obtain one or more specific semantic expressions corresponding to the initial request message. Afterwards, device 104 stores the initial request message and the one or more specific semantic expressions into intelligent knowledge base 108. Further details will be described below.
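The extract-and-fill step of this flow can be sketched as follows. The sketch assumes, purely for illustration, that each abstract semantic expression is paired with a regular expression whose named groups locate the missing element in the request; the disclosure does not state that regular expressions are used, and the function name, template, and pattern are hypothetical.

```python
import re
from typing import Optional


def fill_expression(template: str, pattern: str, request: str) -> Optional[str]:
    """Extract named elements from the request and fill them into the template.

    `template` uses str.format placeholders for semantic-lacking elements;
    `pattern` is an assumed regex with matching named groups.
    Returns None when the request does not match the pattern.
    """
    match = re.search(pattern, request, flags=re.IGNORECASE)
    if match is None:
        return None
    return template.format(**match.groupdict())


# Hypothetical abstract expression "How to activate {service}".
template = "How to activate {service}"
pattern = r"(?:activate|enable|turn on)\s+(?P<service>.+?)\??$"

specific = fill_expression(template, pattern, "How can I turn on call forwarding?")
```

In this sketch the request and the resulting specific semantic expression would then be stored together, in line with the final step described above.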

The above described database 106, knowledge base 108, and database 110 may be implemented using any type of volatile or non-volatile memory devices, or a combination thereof, such as a static random access memory (SRAM), an electrically erasable programmable read-only memory (EEPROM), an erasable programmable read-only memory (EPROM), a programmable read-only memory (PROM), a read-only memory (ROM), a magnetic memory, a flash memory, or a magnetic or optical disk. Also, the structured collection stored therein can be organized as a set of queues, a structured file, a relational database, an object-oriented database, or any other appropriate database. Computer software, such as a database management system, may be utilized to manage and provide access to the data stored therein.

It is understood that the devices shown in FIGS. 1A-1C are for illustrative purposes. Certain devices may be removed or combined and other devices may be added. In some embodiments, the above-described devices in each of FIGS. 1A-1C may be located in one computing device. In other embodiments, they may be located on more than one computing device connected via wide area networks (WANs), local area networks (LANs), wireless networks, or any combination thereof.

FIGS. 2A-2B each illustrate a block diagram of an exemplary embodiment of intelligent-knowledge-base constructing device 104 of FIG. 1A, consistent with the present disclosure. Device 104 disclosed in FIGS. 2A-2B constructs intelligent knowledge base based on semantic similarity calculation, consistent with the present disclosure.

In FIG. 2A, intelligent-knowledge-base constructing device 104 may include a preset knowledge subject obtaining unit 201, a receiving unit 202, a calculation unit 203, a determination unit 204, and a storing unit 205.

Preset knowledge subject obtaining unit 201 can be a hardware computing device running one or more computer programs to obtain preset knowledge subjects from subject domain database 106. Each of the preset knowledge subjects includes a standard question and one or more extended questions. Unit 201 provides the preset knowledge subjects to calculation unit 203 for further processing, as described below.

Receiving unit 202 can be a hardware computing device running one or more computer programs to receive an initial request message from a user via human-machine interface 102. Unit 202 provides the initial request message to calculation unit 203 for further processing, as described below.

Calculation unit 203 is coupled to preset knowledge subject obtaining unit 201 to receive the preset knowledge subjects and coupled to receiving unit 202 to receive the initial request message. Calculation unit 203 can be a processor or any computing device to perform a semantic similarity calculation. The semantic similarity refers to a matching degree of words and phrases between the initial request message and the preset knowledge subjects, and (or) a semantic conformance. Calculation unit 203 performs a semantic similarity calculation on the initial request message and the preset knowledge subjects, and obtains semantic similarity calculation results. Calculation unit 203 performs the semantic similarity calculation between the initial request message and the standard question, and between the initial request message and each extended question respectively, and defines the largest one of the calculation results as a semantic similarity calculation result between the initial request message and the preset knowledge subject. Unit 203 provides the semantic similarity calculation results to determination unit 204 for further processing, as described below.
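The per-subject rule described above, taking the largest score over the standard question and every extended question, can be sketched as follows. The word-overlap (Jaccard) measure is only a stand-in chosen for this illustration and is not one of the methods named in the disclosure; all function names are hypothetical.

```python
from typing import Callable, Iterable


def word_overlap(a: str, b: str) -> float:
    """Stand-in similarity: Jaccard overlap of lowercase word sets (an assumption)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def subject_similarity(
    request: str,
    standard_question: str,
    extended_questions: Iterable[str],
    similarity: Callable[[str, str], float] = word_overlap,
) -> float:
    """Score a request against one preset knowledge subject.

    The request is compared with the standard question and with each
    extended question; the largest score is taken as the subject's result.
    """
    questions = [standard_question, *extended_questions]
    return max(similarity(request, q) for q in questions)


score = subject_similarity(
    "how to enable call forwarding",
    "how do i activate call forwarding",
    ["how to enable call forwarding"],
)
```

Passing the similarity measure as a parameter keeps the maximum-taking rule independent of whichever calculation method an embodiment adopts.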

In some embodiments, calculation unit 203 adopts one or more methods to perform the semantic similarity calculation, such as a calculation method based on VSM (Vector Space Model), a calculation method based on LSI (Latent Semantic Indexing) model, a semantic similarity calculation method based on attribute theory, or a semantic similarity calculation method based on Hamming distance. It should be noted that other semantic similarity calculation methods may also be used.
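As a rough illustration of the first of these options, a vector space model comparison can be reduced to the cosine of the angle between term-frequency vectors. The sketch below makes simplifying assumptions that the disclosure does not specify: whitespace tokenization, raw term counts without weighting, and a hypothetical function name.

```python
import math
from collections import Counter


def vsm_cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between raw term-frequency vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    # Dot product over the terms the two vectors share.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

A practical system would likely add term weighting such as TF-IDF and a language-appropriate tokenizer; both are omitted here to keep the sketch short.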

Determination unit 204 can be a computer program or a hardware computing device running one or more computer programs to determine whether the largest one of the semantic similarity calculation results is greater than a similarity threshold value. In some embodiments, the similarity threshold value ranges from, for example, 0.7 to 1.0. Unit 204 provides the determination result to storing unit 205 for further processing.

Storing unit 205 can be a hardware computing device running one or more computer programs to store data into intelligent knowledge base 108 according to the determination result provided by determination unit 204. For example, when the largest one of the semantic similarity calculation results is greater than the similarity threshold value, storing unit 205 stores into intelligent knowledge base 108 the initial request message, and the standard question and the one or more extended questions of the preset knowledge subject that corresponds to the largest one of the semantic similarity calculation results.
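The determination and storing behavior described above can be sketched together as follows. The list-based knowledge base, the dictionary layout, and the function name are assumptions for illustration only; the default threshold of 0.7 is the lower end of the example range given above.

```python
from typing import Dict, List, Tuple

Subject = Dict[str, object]


def determine_and_store(
    request: str,
    scored_subjects: List[Tuple[float, Subject]],
    knowledge_base: List[Dict[str, object]],
    threshold: float = 0.7,
) -> bool:
    """Store the request with its best-matching subject if the score is high enough.

    `scored_subjects` pairs each preset knowledge subject with its semantic
    similarity calculation result. Returns True when something was stored.
    """
    if not scored_subjects:
        return False
    best_score, best_subject = max(scored_subjects, key=lambda pair: pair[0])
    if best_score > threshold:
        knowledge_base.append(
            {
                "request": request,
                "standard_question": best_subject["standard_question"],
                "extended_questions": best_subject["extended_questions"],
            }
        )
        return True
    return False
```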

In some embodiments, intelligent-knowledge-base constructing device 104 may also include an answer providing unit (not shown in FIG. 2A). The answer providing unit can be a hardware computing device running one or more computer programs to provide an answer corresponding to the initial request message. The storing unit stores the initial request message and corresponding answer into intelligent knowledge base 108.

In some embodiments, intelligent-knowledge-base constructing device 104 may further include an extracting unit. The extracting unit can be a hardware computing device running one or more computer programs to extract portions of preset knowledge subjects. For example, after the receiving unit stops receiving initial request messages, the extracting unit extracts at least portions of the preset knowledge subjects that are not stored in intelligent knowledge base 108. Storing unit 205 stores the preset knowledge subjects extracted by the extracting unit into intelligent knowledge base 108.
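The extracting unit's behavior might be sketched as selecting those preset knowledge subjects whose standard question is not yet in the knowledge base. Identifying a subject by its standard question, the optional limit, and the function name are all assumptions made for this illustration.

```python
from typing import Dict, List, Optional, Set


def extract_unstored_subjects(
    preset_subjects: List[Dict[str, object]],
    stored_standard_questions: Set[str],
    limit: Optional[int] = None,
) -> List[Dict[str, object]]:
    """Return preset knowledge subjects not yet stored in the knowledge base.

    `limit` models extracting only a portion of the unstored subjects;
    None means all of them are returned.
    """
    missing = [
        s for s in preset_subjects
        if s["standard_question"] not in stored_standard_questions
    ]
    return missing if limit is None else missing[:limit]
```

The returned subjects would then be handed to the storing unit once request intake has stopped.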

Now referring to FIG. 2B, in some embodiments, intelligent-knowledge-base constructing device 104 may further include a message preprocessing unit 207, in addition to preset knowledge subject obtaining unit 201, receiving unit 202, calculation unit 203, determination unit 204, and storing unit 205, as disclosed in FIG. 2A and described above.

Message preprocessing unit 207 can be a computer program or a hardware computing device running one or more computer programs to preprocess the initial request message received from human-machine interface 102 via receiving unit 202. Unit 207 extracts representative features from messages to be updated. The representative features are used for similarity calculation. In some embodiments, the message preprocessing may include a word segmentation process and a stop word removal process. Further, the message preprocessing may also include removing some meaningless words, for example, “I”, “want”, “what”, etc., which may not be helpful in the semantic similarity calculation.

FIGS. 2C-2E each illustrate a block diagram of an exemplary embodiment of intelligent-knowledge-base constructing device 104 of FIG. 1B or 1C, consistent with the present disclosure. Device 104 disclosed in FIGS. 2C-2E constructs an intelligent knowledge base based on semantic similarity calculation and/or abstract semantic recommendation. In some embodiments, device 104 constructs the intelligent knowledge base based on abstract semantic recommendation and thus may not include components facilitating semantic similarity calculation. Accordingly, subject domain database 106, preset knowledge subject obtaining unit 201, receiving unit 202, calculation unit 203, determination unit 204, and message preprocessing unit 207 are illustrated as dotted blocks, and connections between them are illustrated as dotted lines, in FIGS. 2C-2E.

In FIG. 2C, intelligent-knowledge-base constructing device 104 may further include an abstract semantic recommending module 208, an abstract semantic expression obtaining unit 209, and a filling unit 210, in addition to those components (i.e., preset knowledge subject obtaining unit 201, receiving unit 202, calculation unit 203, determination unit 204, and storing unit 205) disclosed in FIG. 2A as described above. Abstract semantic recommending module 208, abstract semantic expression obtaining unit 209, and filling unit 210 will be described in detail below. The detailed description of those components also shown in FIG. 2A is provided above and will not be repeated here.

Abstract semantic expression obtaining unit 209 can be a hardware computing device running one or more computer programs to obtain one or more abstract semantic expressions from abstract semantic database 110 and provide them to abstract semantic recommending module 208 for further processing. Each of the abstract semantic expressions includes a semantic-lacking element.

Abstract semantic recommending module 208 can be a hardware computing device running one or more computer programs to perform an abstract semantic recommending process on the initial request message received from human-machine interface 102 via receiving unit 202. For example, when determination unit 204 determines that the largest one of the semantic similarity calculation results is smaller than the similarity threshold value, module 208 performs an abstract semantic recommending process on the initial request message based on the abstract semantic expressions received from abstract semantic database 110. Based on the recommendation, module 208 can obtain one or more abstract semantic expressions corresponding to the initial request message. FIGS. 3A-3B each illustrate a block diagram of an exemplary abstract semantic recommending module 208, according to embodiments of the present disclosure.

As shown in FIG. 3A, abstract semantic recommending module 208 includes a word segmentation unit 302, a part-of-speech tagging unit 304, a wordclass determination unit 306, a searching unit 308, and a matching unit 310.

Word segmentation unit 302 can be a hardware computing device running one or more computer programs to perform a word segmentation process on the initial request message. The segmentation process breaks the message into one or more single words. The word segmentation process may use a forward (reverse) maximum matching method, a best matching method, a word-by-word traversal method, a word frequency statistics method, or other suitable word segmentation methods.

Part-of-speech tagging unit 304 can be a hardware computing device running one or more computer programs to perform a part-of-speech tagging process on each single word, so as to obtain the part-of-speech information of each single word. The part-of-speech tagging process, also called grammatical tagging, marks up a word in a text (corpus) as corresponding to a particular part of speech (such as a noun, a verb, an adjective, an adverb, etc.), based on both its definition and its context, e.g., its relationship with adjacent and related words in a phrase, sentence, or paragraph.

Wordclass determination unit 306 can be a hardware computing device running one or more computer programs to perform a wordclass determination process on each single word. Based on the determination process, unit 306 can obtain the wordclass information of each single word.

Searching unit 308 can be a hardware computing device running one or more computer programs to perform a searching process on the abstract semantic expressions obtained by the abstract semantic expression obtaining unit 209 to obtain an abstract semantic candidate set relevant to the initial request message. The abstract semantic candidate set includes a plurality of abstract semantic expressions.

Matching unit 310 can be a hardware computing device running one or more computer programs to perform a matching process on the abstract semantic expressions in the abstract semantic candidate set based on the part-of-speech information and the wordclass information. Based on the matching process, unit 310 can obtain an abstract semantic expression corresponding to the initial request message.

In some embodiments, the abstract semantic expression includes a semantic rule word. At least parts of semantic rule words of the abstract semantic expressions in the abstract semantic candidate set, obtained by the searching unit, are the same, or belong to a same wordclass, as parts of the single words of the initial request message.

Matching unit 310 determines the abstract semantic expression corresponding to the initial request message based on the following conditions: the part-of-speech corresponding to the semantic-lacking element includes the part-of-speech of the corresponding filling element; single words of the initial request message other than the filling element are the same as, or belong to a same wordclass as, the semantic rule words; and the abstract semantic expression has a same order as the initial request message.
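As a non-limiting illustration, the three matching conditions may be sketched in Python as follows. The `Slot` class, the `matches` function, the token-list representation of an abstract semantic expression, and the `wordclass` lookup table are hypothetical names introduced only for this sketch. The sketch further assumes that each semantic-lacking element is filled by exactly one single word.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """Hypothetical semantic-lacking element and the parts of speech it accepts."""
    name: str
    accepted_pos: frozenset


def same_or_same_wordclass(rule_word, word, wordclass):
    """True if the words are identical or map to the same wordclass label."""
    if rule_word == word:
        return True
    return rule_word in wordclass and wordclass.get(rule_word) == wordclass.get(word)


def matches(template, tagged_message, wordclass):
    """Check the three conditions for one candidate expression.

    template: list of semantic rule words (str) and Slot objects, in order.
    tagged_message: list of (single word, part of speech) pairs, in order.
    """
    # Assumption: one single word per position, so lengths must agree.
    if len(template) != len(tagged_message):
        return False
    # Walking both sequences in parallel enforces the "same order" condition.
    for item, (word, pos) in zip(template, tagged_message):
        if isinstance(item, Slot):
            # Condition 1: the slot's parts of speech include the filler's.
            if pos not in item.accepted_pos:
                return False
        elif not same_or_same_wordclass(item, word, wordclass):
            # Condition 2: other words equal, or share a wordclass with, rule words.
            return False
    return True
```

In this sketch, a message whose words appear in a different order than the rule words fails the comparison at the first mismatching position, which is one simple way to enforce the ordering condition.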

FIG. 3B illustrates a block diagram of another exemplary abstract semantic recommending module. As shown in FIG. 3B, abstract semantic recommending module 208 includes a rule-word identification unit 312, in addition to word segmentation unit 302, part-of-speech tagging unit 304, wordclass determination unit 306, searching unit 308, and matching unit 310 of FIG. 3A as described above. Rule-word identification unit 312 can be a hardware computing device running one or more computer programs to identify each single word as a semantic rule word or a non-semantic rule word. Part-of-speech tagging unit 304 performs grammatical tagging on each non-semantic rule word to obtain part-of-speech information of the non-semantic rule word. Wordclass determination unit 306 performs a wordclass determination process on each semantic rule word to obtain wordclass information of the semantic rule word.

Referring back to FIG. 2C, filling unit 210 can be a hardware computing device running one or more computer programs to fill an element into the semantic-lacking element of the abstract semantic expressions to obtain specific semantic expressions corresponding to the initial request message. For example, after the one or more abstract semantic expressions corresponding to the initial request message are obtained, filling unit 210 extracts, from the initial request message, an element corresponding to the semantic-lacking element of the one or more abstract semantic expressions, and fills the extracted element into the semantic-lacking element to obtain one or more specific semantic expressions corresponding to the initial request message.
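As a non-limiting illustration, the extracting and filling operations may be sketched in Python as follows. The bracketed `[name]` notation for a semantic-lacking element, the function name `extract_and_fill`, and the position-by-position alignment between the expression and the segmented message are assumptions made only for this sketch.

```python
def extract_and_fill(template, words):
    """Extract slot fillers from a segmented message and fill the template.

    template: list of str; a semantic-lacking element is written "[name]".
    words: single words of the initial request message, already aligned
           position by position with the template (an assumption here).
    Returns (fillers, specific_expression).
    """
    if len(template) != len(words):
        raise ValueError("message does not align with the abstract expression")
    fillers = {}
    filled = []
    for item, word in zip(template, words):
        if item.startswith("[") and item.endswith("]"):
            # Extract the element of the message corresponding to this slot.
            fillers[item[1:-1]] = word
            filled.append(word)
        else:
            # Semantic rule words of the expression are kept unchanged.
            filled.append(item)
    return fillers, " ".join(filled)
```

For instance, under these assumptions, aligning the expression `["how", "to", "register", "[service]"]` with the words `["how", "to", "open", "CRBT"]` extracts `CRBT` for the `service` element and yields the specific semantic expression `how to register CRBT`.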

Based on the results of determination unit 204 and filling unit 210, storing unit 205 can store data into intelligent knowledge base 108. For example, when the largest one of the plurality of semantic similarity calculation results is greater than the similarity threshold value, storing unit 205 stores into intelligent knowledge base 108 the initial request message, together with the standard question and the one or more extended questions of the preset knowledge subject corresponding to the largest one of the semantic similarity calculation results. When the largest one of the semantic similarity calculation results is smaller than the similarity threshold value, storing unit 205 stores into intelligent knowledge base 108 the initial request message and the one or more specific semantic expressions.

In some embodiments, device 104 does not need the semantic similarity information to construct the intelligent knowledge base. Specifically, abstract semantic recommending module 208 receives an initial request message from human-machine interface 102 via receiving unit 202 and obtains, via abstract semantic expression obtaining unit 209, abstract semantic expressions each including a semantic-lacking element from abstract semantic database 110. Module 208 then performs an abstract semantic recommending process on the initial request message based on abstract semantic database 110 to obtain one or more abstract semantic expressions corresponding to the initial request message. Filling unit 210 extracts, from the initial request message, an element corresponding to the semantic-lacking element of the one or more abstract semantic expressions. Filling unit 210 then fills the extracted element into the semantic-lacking element to obtain one or more specific semantic expressions corresponding to the initial request message. Afterwards, storing unit 205 stores the initial request message and the one or more specific semantic expressions into intelligent knowledge base 108.

Now referring to FIG. 2D, in some embodiments, intelligent-knowledge-base constructing device 104 includes abstract semantic recommending module 208, abstract semantic expression obtaining unit 209, and filling unit 210, in addition to those components (i.e., preset knowledge subject obtaining unit 201, receiving unit 202, message preprocessing unit 207, calculation unit 203, determination unit 204, and storing unit 205) disclosed in FIG. 2B as described above. The detailed description for each component in FIG. 2D has been provided above and will not be repeated here.

With reference to FIG. 2E, intelligent-knowledge-base constructing device 104 may further include a scoring unit 211, in addition to those components disclosed in FIG. 2D as described above. Scoring unit 211 can be a hardware computing device running one or more computer programs to perform a scoring process on an abstract semantic expression corresponding to the initial request message. For example, when a number M of the abstract semantic expressions corresponding to the initial request message is larger than a number N of the specific semantic expressions, scoring unit 211 performs a scoring process on each abstract semantic expression corresponding to the initial request message. Filling unit 210 then extracts from the initial request message elements corresponding to semantic-lacking elements of the N abstract semantic expressions having the highest scores. Further, filling unit 210 fills the extracted elements into the corresponding semantic-lacking elements of those N abstract semantic expressions, so as to obtain N specific semantic expressions corresponding to the initial request message.

Scoring unit 211 performs a scoring process according to one or more methods consistent with the present disclosure. For example, scoring unit 211 can set a higher score to the abstract semantic expression if the number of the matched semantic-lacking elements is higher. In some embodiments, scoring unit 211 can define a semantic-lacking element of the abstract semantic expression as a core semantic element, and set a higher score to the abstract semantic expression if the semantic-lacking element is closer to the core semantic element. In some embodiments, scoring unit 211 can set a higher score to the abstract semantic expression if the confidence value of the part-of-speech is higher. In some embodiments, scoring unit 211 can set a higher score to the abstract semantic expression if a priority level is higher, wherein the priority level is pre-assigned to the abstract semantic expression. Further, in some embodiments, scoring unit 211 can set a higher score to the abstract semantic expression if a probability is higher, wherein the probability is determined based on a natural language model, and corresponds to whether data information obtained by filling segmentation words of a large amount of crawled corpus data into the abstract semantic expression has correct semantic information. Scoring unit 211 can adopt one or more of the above methods to perform the scoring.
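As a non-limiting illustration, combining several of the above scoring methods and selecting the N highest-scoring abstract semantic expressions may be sketched in Python as follows. The `Candidate` fields, the linear weighted sum, and the weight values are assumptions made only for this sketch; the disclosure does not prescribe how the individual criteria are combined.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical per-expression features mirroring the criteria above."""
    expression: str
    matched_slots: int = 0       # number of matched semantic-lacking elements
    core_distance: int = 0       # distance of the slot from the core element
    pos_confidence: float = 0.0  # confidence value of the part-of-speech
    priority: int = 0            # pre-assigned priority level
    lm_probability: float = 0.0  # probability from a language model


# Illustrative weights only; a smaller core distance should score higher,
# so that weight is negative.
WEIGHTS = {
    "matched_slots": 1.0,
    "core_distance": -1.0,
    "pos_confidence": 1.0,
    "priority": 1.0,
    "lm_probability": 1.0,
}


def score(candidate, weights=WEIGHTS):
    """Weighted sum of whichever criteria are in use."""
    return sum(w * getattr(candidate, name) for name, w in weights.items())


def select_top_n(candidates, n):
    """Return the n candidates with the highest scores, best first."""
    return sorted(candidates, key=score, reverse=True)[:n]
```

A subset of the criteria can be used by passing a `weights` mapping that contains only the corresponding keys.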

The illustrated configuration of intelligent-knowledge-base constructing device 104 in each of FIGS. 2A-2E is exemplary only, and persons of ordinary skill in the art will appreciate that the various illustrated elements may be provided as discrete elements or be combined, and be provided as any combination of hardware and software.

FIG. 4 illustrates a flow chart of an exemplary method for intelligent-knowledge-base constructing device 104 of FIG. 2A or 2B constructing an intelligent knowledge base based on semantic similarity calculation, according to embodiments of the present disclosure. The flow chart includes steps 402 through 410.

Specifically, at step 402, intelligent-knowledge-base constructing device 104 obtains a plurality of preset knowledge subjects from a subject domain database 106 in a specific field. Each preset knowledge subject includes a standard question and one or more extended questions. It should be noted that the standard question and the extended questions can be expressed not only in semantic expression forms, but also in specific question forms. In some embodiments, the preset knowledge subject includes not only a standard question and one or more extended questions, but also an answer corresponding to the standard question and the one or more extended questions.

For example, regarding how to register a CRBT (Color Ring Back Tone) service in the telecommunication field, the plurality of preset knowledge subjects, which are stored in the subject domain database and are relevant to “how to register a CRBT service,” include: “how to register a CRBT service,” “what's the step for registering a CRBT service,” “what should I do if I want to register a CRBT service,” and “how does a CRBT service be registered.” One of the above questions is defined as the standard question, and the others are defined as the extended questions. In some embodiments, the first question “how to register a CRBT service” is defined as the standard question, and the other three questions are defined as the corresponding extended questions. In other embodiments, other questions may be defined as the standard question. In some embodiments, the subject domain database may further include answers to the question “how to register a CRBT service.”

The subject domain database further includes other preset knowledge subjects, for example, a preset knowledge subject on how to register a GPRS service, a preset knowledge subject on how to suspend the mobile phone service, or a preset knowledge subject on how to register a discount service for long distance communication.

At step 404, intelligent-knowledge-base constructing device 104 receives an initial request message from a user via human-machine interface 102. In some embodiments, the initial request message is a message inputted through human-machine interface 102. For example, the initial request message is a text message inputted from a keyboard, a touch screen, etc. In some embodiments, the initial request message is a voice message that is inputted from a microphone and converted by a speech recognition engine.

Taking how to register a CRBT service as an example, the received initial request message may be “what should I do if I want to register a CRBT service.” The method proceeds to step 406.

At step 406, intelligent-knowledge-base constructing device 104 performs a semantic similarity calculation on the initial request message and the plurality of preset knowledge subjects of the subject domain database to obtain a plurality of semantic similarity calculation results. When there are a plurality of preset knowledge subjects, the semantic similarity calculation is performed on the initial request message and each preset knowledge subject successively, so as to obtain corresponding semantic similarity calculation results.

In some embodiments, when each preset knowledge subject includes a standard question and one or more extended questions, the semantic similarity calculation is performed between the initial request message and the standard question, and between the initial request message and each extended question, respectively. The largest one of the calculation results is defined as the semantic similarity calculation result between the initial request message and the preset knowledge subject.
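As a non-limiting illustration, taking the largest per-question value as the result for a preset knowledge subject may be sketched in Python as follows. The `jaccard` word-overlap function is only a placeholder similarity chosen for this sketch and is not the calculation method of the present disclosure; the function names are likewise hypothetical.

```python
def jaccard(a, b):
    """Placeholder similarity: word overlap between two sentences, in [0, 1]."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def subject_similarity(message, standard_question, extended_questions,
                       similarity=jaccard):
    """Largest similarity between the message and any question of the subject."""
    questions = [standard_question, *extended_questions]
    return max(similarity(message, q) for q in questions)
```

Any other pairwise similarity function taking two sentences and returning a number can be passed through the `similarity` parameter.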

In some embodiments, a process of the semantic similarity calculation may include: performing a similarity calculation on the initial request message and the preset knowledge subject of the subject domain database according to a similarity calculation method, so as to obtain a first feature value corresponding to the initial request message and a second feature value corresponding to the preset knowledge subject. The first feature value and the second feature value are then compared or processed to obtain a similarity value (or a semantic similarity calculation result). The comparison or processing may be a comparison operation, a subtraction operation, or an operation in another form. The closer the first feature value is to the second feature value, the higher the similarity value between the initial request message and its corresponding preset knowledge subject is, and vice versa.

It should be noted that, when the semantic similarity calculation is performed, the first feature value and the second feature value can be obtained in parallel (calculated at the same time), or can be obtained serially (calculated successively).

For example, in some embodiments, intelligent-knowledge-base constructing device 104 may adopt a calculation method based on a vector space model (VSM) to perform a semantic similarity calculation on the initial request message and the preset knowledge subjects of the subject domain database. The initial request message and the preset knowledge subjects in the subject domain database may include independent entry groups (T1, T2, . . . , Tn). A predetermined weight Wi is assigned to each entry Ti (1 ≤ i ≤ n) based on its importance in a sentence. T1, T2, . . . , Tn are regarded as coordinate axes in an n-dimensional coordinate system, and W1, W2, . . . , Wn are regarded as the corresponding coordinate values. Thus, an orthogonal entry vector group obtained by resolving (T1, T2, . . . , Tn) may constitute a vector space, and the entry can be mapped to a point in the vector space. As the initial request message and all of the preset knowledge subjects in the subject domain database can be mapped to the vector space and be represented by entry vectors (T1, W1, T2, W2, . . . , Tn, Wn), a matching problem of sentence information can be transformed into a matching problem of vectors in a vector space. Specifically, for the initial request message and the preset knowledge subjects in the subject domain database, the semantic similarity calculation result is a ratio between angles corresponding to the two vectors, namely, the ratio between a first angle (a first feature value) of an entry vector corresponding to the preset knowledge subject in the vector space and a second angle (a second feature value) of an entry vector corresponding to the initial request message in the vector space. The closer the ratio is to 1, the higher the similarity of the two entries is. In other embodiments, the similarity of the two entries may be represented by an intersection angle between vectors. The smaller the intersection angle is, the higher the similarity of the two entries is. The intersection angle (similarity value) is a difference value of the first angle (a first feature value) of the entry vector corresponding to the preset knowledge subject in the vector space and the second angle (a second feature value) of the entry vector corresponding to the initial request message in the vector space.
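As a non-limiting illustration of the intersection-angle embodiment, two weighted entry vectors may be compared in Python as follows. This sketch obtains the intersection angle from the normalized dot product (cosine) of the two vectors, which is a standard way to compute the angle between vectors and is an assumption of this sketch rather than a formula stated in the disclosure. The function names, the example weights, and the treatment of an all-zero vector are likewise hypothetical.

```python
import math


def to_vector(weights, axes):
    """Map an entry-to-weight dict onto fixed coordinate axes (T1..Tn)."""
    return [weights.get(entry, 0.0) for entry in axes]


def intersection_angle(a, b):
    """Angle in radians between two entry vectors; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: an empty vector is treated as orthogonal to everything.
        return math.pi / 2
    # Clamp to guard against floating-point drift outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)


axes = ["register", "crbt", "service", "suspend"]
message = to_vector({"register": 1.0, "crbt": 2.0, "service": 1.0}, axes)
subject = to_vector({"register": 1.0, "crbt": 2.0, "service": 1.0}, axes)
other = to_vector({"suspend": 1.0}, axes)
```

Here `message` and `subject` carry identical weights, so their intersection angle is approximately zero, whereas `message` and `other` share no entries and are orthogonal.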

It should be noted that, before performing the semantic similarity calculation, intelligent-knowledge-base constructing device 104 may perform message preprocessing on the initial request message and the preset knowledge subjects in the subject domain database, so as to extract representative features from the initial request message and the preset knowledge subjects in the subject domain database. The representative features can be used as a basis of the similarity calculation to improve its accuracy. In some embodiments, the message preprocessing includes word segmentation processing and stop word removal processing. Further, the message preprocessing may also include removing some meaningless words, for example, “I”, “want”, “what”, etc.

In some embodiments, intelligent-knowledge-base constructing device 104 may perform word segmentation processing based on word segmentation principles. The word segmentation principles may be preset in the system. When the word segmentation processing is performed, the preset word segmentation principles are invoked to perform the word segmentation processing on the initial request message and the corresponding preset knowledge subjects in the subject domain database, so as to form two entry strings constituted by entries.

A stop word list is pre-established for the stop word removal processing. When the stop word removal processing is performed, a matching process is performed between each entry and the entries in the stop word list. If the entry exists in the stop word list, the entry is deleted from the entry strings obtained by the word segmentation processing.

The word segmentation processing may use a maximum matching method, a best matching method, a word-by-word traversal method, a word frequency statistics method, or another suitable word segmentation method. The stop word removal processing is used to remove words that appear frequently and carry little distinguishing information, such as “this”, “of”, “and”, etc. These words may introduce large errors in the process of similarity calculation, and may be regarded as a kind of noise.
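As a non-limiting illustration, word segmentation by forward maximum matching followed by stop word removal may be sketched in Python as follows. The dictionary contents, the stop word list, the function names, and the fallback of emitting a single character when no dictionary word matches are assumptions made only for this sketch.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Segment unspaced text by greedily taking the longest dictionary word."""
    if max_len is None:
        max_len = max((len(w) for w in dictionary), default=1)
    max_len = max(1, max_len)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest window first and shrink until a word is found.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Assumption: an unknown single character becomes its own entry.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def remove_stop_words(tokens, stop_words):
    """Delete every entry that appears in the pre-established stop word list."""
    return [t for t in tokens if t not in stop_words]


dictionary = {"how", "to", "register", "crbt"}
entries = forward_max_match("howtoregistercrbt", dictionary)
features = remove_stop_words(entries, {"to"})
```

Under these assumptions, `entries` is the entry string formed by segmentation, and `features` is the same string with the stop word removed.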

Taking how to register a CRBT service as an example, the semantic similarity calculation is performed between the received initial request message “how can I register a CRBT service” and some preset knowledge subjects in the subject domain database.

A semantic similarity calculation performed between the initial request message “how can I register a CRBT service” and a preset knowledge subject (“how to register a CRBT service,” “what's the step for registering a CRBT service,” “what should I do if I want to register a CRBT service,” and “how does a CRBT service be registered”) is taken as an example. Semantic similarity calculations are performed between the initial request message “how can I register a CRBT service” and “how to register a CRBT service,” “what's the step for registering a CRBT service,” “what should I do if I want to register a CRBT service,” and “how does a CRBT service be registered,” respectively, so as to obtain four semantic similarity calculation values. The largest one of the four semantic similarity calculation values is defined as the semantic similarity calculation result.

As there are a plurality of preset knowledge subjects in the subject domain database, a plurality of semantic similarity calculation results may be obtained correspondingly.

At step 408, intelligent-knowledge-base constructing device 104 determines whether the largest one of the plurality of semantic similarity calculation results is greater than a similarity threshold value. When the largest one of the plurality of semantic similarity calculation results is greater than the similarity threshold value, the method proceeds to step 410.

The similarity threshold value is preset. In one embodiment, the similarity threshold value is greater than or equal to 0.7, and is less than or equal to 1.0. It should be noted that the similarity threshold value may take other values.

At step 410, intelligent-knowledge-base constructing device 104 stores into intelligent knowledge base 108 the initial request message, and the standard question and the one or more extended questions of the preset knowledge subject corresponding to the largest one of the plurality of semantic similarity calculation results.

In some embodiments, when the initial request message and the standard question and the one or more extended questions of a preset knowledge subject corresponding to the largest one of the plurality of semantic similarity calculation results are stored into the intelligent knowledge base, the initial request message is stored as a new standard question, while the standard question and the one or more extended questions of the preset knowledge subject corresponding to the largest one of the plurality of semantic similarity calculation results are stored as new extended questions for the new standard question.
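As a non-limiting illustration, storing the initial request message as a new standard question with the matched questions as its extended questions may be sketched in Python as follows. The `KnowledgeNode` structure, the list-based knowledge base, and the function name are hypothetical; skipping a preset question that is identical to the initial request message is an additional assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class KnowledgeNode:
    """Hypothetical knowledge node of the intelligent knowledge base."""
    standard_question: str
    extended_questions: List[str] = field(default_factory=list)
    answer: Optional[str] = None


def store_matched(knowledge_base, message, preset_standard, preset_extended,
                  answer=None):
    """Store the message as a new standard question for the matched subject."""
    # The preset standard and extended questions all become extended questions.
    # Assumption: a question identical to the message is not stored twice.
    extended = [q for q in [preset_standard, *preset_extended] if q != message]
    node = KnowledgeNode(message, extended, answer)
    knowledge_base.append(node)
    return node
```

The optional `answer` argument can carry either an answer provided for the initial request message or an answer taken from the subject domain database, depending on the embodiment.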

In some embodiments, when the largest one of the plurality of semantic similarity calculation results is greater than the similarity threshold value, and the initial request message and the standard question and the one or more extended questions of a preset knowledge subject corresponding to the largest one of the plurality of semantic similarity calculation results are stored into the intelligent knowledge base, an answer corresponding to the initial request message is provided and is stored into the intelligent knowledge base along with the questions. The provided answer is supplied by the user. Because the answer corresponds directly to the initial request message, the answer stored in the intelligent knowledge base is more accurate.

In another embodiment, when the largest one of the plurality of semantic similarity calculation results is greater than the similarity threshold value, and the initial request message and the standard question and the one or more extended questions of a preset knowledge subject corresponding to the largest one of the plurality of semantic similarity calculation results are stored into the intelligent knowledge base, an answer stored in the subject domain database is stored into the intelligent knowledge base along with the questions. Thus, the efficiency of establishing the intelligent knowledge base is improved.

If the largest one of the plurality of semantic similarity calculation results is greater than the similarity threshold value, the similarity between the initial request message and the corresponding preset knowledge subject is very high, which indicates that the user wants to store this preset knowledge subject into the intelligent knowledge base. In the method of the present embodiments, after the initial request message input by the user is received, the semantic similarity calculation is performed, and the similarity calculation result is compared with the similarity threshold value, the preset knowledge subject reaching the similarity threshold value and the corresponding initial request message are stored in the intelligent knowledge base. Thus, in the process of constructing the intelligent knowledge base, the user does not need to input a plurality of related questions into the intelligent knowledge base. Therefore, the efficiency of constructing the intelligent knowledge base is improved.

Taking how to register a CRBT service as an example, when a similarity calculation result between the initial request message (“how can I register a CRBT service”) and the preset knowledge subject (“how to register a CRBT service”, “what's the step for registering a CRBT service”, “what should I do if I want to register a CRBT service”, “how does a CRBT service be registered”) is greater than the similarity threshold value, the preset knowledge subject (“how to register a CRBT service”, “what's the step for registering a CRBT service”, “what should I do if I want to register a CRBT service”, “how does a CRBT service be registered”) is stored in the intelligent knowledge base.

In some embodiments, when the largest one of the plurality of semantic similarity calculation results is equal to the similarity threshold value, the method proceeds to step 410.

As the number of initial request messages inputted by the user is limited, it is difficult for the initial request messages inputted by the user to correspond to all the preset knowledge subjects in the subject domain database. Also, it is difficult to store all the preset knowledge subjects in the subject domain database into the intelligent knowledge base through steps 406-410. Therefore, after it stops receiving initial request messages, intelligent-knowledge-base constructing device 104 may extract at least parts of the preset knowledge subjects which are not stored in the intelligent knowledge base, and store the extracted preset knowledge subjects into the intelligent knowledge base, wherein the preset knowledge subjects include answers.

For example, the subject domain database includes 1000 preset knowledge subjects. Semantic similarity calculations are performed on 500 initial request messages inputted by the user, and the largest results obtained, which correspond to 500 of the preset knowledge subjects, are all greater than the similarity threshold value. Thus, all the standard questions and extended questions of these 500 preset knowledge subjects are stored in the intelligent knowledge base. However, the remaining 500 preset knowledge subjects are not included in the intelligent knowledge base. Because the preset knowledge subjects in the subject domain database are common knowledge of a field, the remaining 500 preset knowledge subjects can be efficiently used. If no further initial request message is inputted, the remaining 500 preset knowledge subjects may be directly stored in the intelligent knowledge base, so as to fill 500 new knowledge nodes into the intelligent knowledge base.

It should be noted that, when the preset knowledge subject is stored in the intelligent knowledge base, the standard question of the preset knowledge subject serves as a standard question of the corresponding knowledge node in the intelligent knowledge base, the extended question of the preset knowledge subject serves as an extended question of the corresponding knowledge node in the intelligent knowledge base, and the answer of the preset knowledge subject serves as an answer of the corresponding knowledge node in the intelligent knowledge base. Thus, the efficiency of constructing the intelligent knowledge base is improved while the subject domain database is effectively used.

In order to avoid a situation in which the preset knowledge subjects of the subject domain database do not meet the requirements of the intelligent knowledge base, a screening process may be performed on the preset knowledge subjects which are not stored in the intelligent knowledge base, such that only parts of the remaining preset knowledge subjects are stored in the intelligent knowledge base and the accuracy of the intelligent knowledge base is ensured.
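As a non-limiting illustration, storing the remaining preset knowledge subjects with an optional screening process may be sketched in Python as follows. The dictionary-based stores keyed by a subject identifier, the function name `backfill`, and the `keep` predicate standing in for the screening process are assumptions made only for this sketch; the disclosure does not specify the screening criteria.

```python
def backfill(knowledge_base, preset_subjects, keep=None):
    """Copy preset subjects not yet stored into the knowledge base.

    knowledge_base, preset_subjects: dicts keyed by a hypothetical subject id.
    keep: optional screening predicate; subjects for which it returns a
          false value are skipped.
    Returns the number of knowledge nodes added.
    """
    added = 0
    for subject_id, subject in preset_subjects.items():
        if subject_id in knowledge_base:
            continue  # already stored through steps 406-410
        if keep is not None and not keep(subject):
            continue  # rejected by the screening process
        # The standard question, extended questions, and answer carry over.
        knowledge_base[subject_id] = dict(subject)
        added += 1
    return added
```

For example, passing `keep=lambda subject: bool(subject["answer"])` would store only those remaining subjects that have a non-empty answer, which is one possible screening rule among many.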

FIG. 5 illustrates a flow chart of another exemplary method forintelligent-knowledge-base constructing device 104 of FIG. 2C or 2D toconstruct an intelligent knowledge base according to embodiments of thepresent disclosure. The flow chart includes steps 502 through 508, inaddition to steps 402 through 410 disclosed in FIG. 4 as describedabove. The detailed description of steps 402 through 410 is providedabove and is not repeated here.

As shown in FIG. 5, at step 408, intelligent-knowledge-base constructing device 104 determines whether the largest one of the plurality of semantic similarity calculation results is greater than a similarity threshold value. When the largest one of the plurality of semantic similarity calculation results is greater than the similarity threshold value, the method proceeds to step 410, as described above. When the largest one of the plurality of semantic similarity calculation results is equal to the similarity threshold value, in some embodiments, the method proceeds to step 410, while in other embodiments, the method may proceed to step 502. When the largest one of the plurality of semantic similarity calculation results is smaller than the similarity threshold value, the method proceeds to step 502.
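The routing performed at step 408 can be sketched as follows. This is a minimal illustration only: the function name `next_step` and the flag `equal_goes_to_410` are hypothetical, and the sketch assumes at least one similarity result is available.

```python
def next_step(similarities, threshold, equal_goes_to_410=True):
    """Return the step number (410 or 502) that follows step 408.

    `equal_goes_to_410` is a hypothetical switch selecting between the two
    embodiments described for the case where the largest result equals
    the similarity threshold value.
    """
    best = max(similarities)  # largest semantic similarity calculation result
    if best > threshold:
        return 410
    if best == threshold:
        return 410 if equal_goes_to_410 else 502
    return 502
```

In this sketch, a result list whose largest value is below the threshold always leads to the abstract semantic branch that begins at step 502.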

As there is a wide variety of knowledge in the process for establishing the intelligent knowledge base and the intelligent knowledge base may correspond to different fields, the received initial request messages may vary widely. Because it is impossible for the corresponding knowledge base to include all the knowledge subjects, establishing the intelligent knowledge base through the similarity calculation method alone has some limitations. In order to further improve the efficiency for establishing the intelligent knowledge base, when the largest one of the plurality of semantic similarity calculation results is smaller than the similarity threshold value, the process for establishing the intelligent knowledge base proceeds to step 502.

For example, in one embodiment, when the received initial request message is “how to open a credit card of the Bank of Communications (BOC) through online banking”, a similarity calculation result, obtained by performing a semantic similarity calculation between this initial request message and the preset knowledge subjects in the knowledge database, may be smaller than the similarity threshold value. However, the user still wants to establish a knowledge subject related to “how to open a credit card of the BOC through online banking” in the intelligent knowledge base. Thus, another method to construct the intelligent knowledge base is provided in the following embodiments of the present disclosure. When the similarity calculation result is smaller than the similarity threshold value, this method can further improve the efficiency for establishing the intelligent knowledge base.

At step 502, intelligent-knowledge-base constructing device 104 obtains a plurality of abstract semantic expressions from an abstract semantic database 110. An abstract semantic expression includes a semantic-lacking element. Subsequently, an element is filled into a place corresponding to the semantic-lacking element in the abstract semantic expression, so as to obtain a specific semantic expression.

An abstract semantic expression may include not only a semantic-lacking element, but also a semantic rule word. In some embodiments, the semantic rule word is marked with wordclass information. The wordclass information indicates that the semantic rule word belongs to a wordclass. A wordclass includes several key words having a same usage and a similar semantic meaning.

The abstract semantic expression may only include a semantic-lacking element. The abstract semantic expression in this form is defined as a default set. The abstract semantic expression may include a plurality of semantic-lacking elements. Each semantic-lacking element has a corresponding property, and different semantic-lacking elements have different properties. The property of the semantic-lacking element defines a property of the corresponding element used to fill the semantic-lacking element. That is, only the portion of the initial request message which meets the requirements of the property of the semantic-lacking element can be filled into the semantic-lacking element, so as to form the specific semantic expression.

For example, in some embodiments, the abstract semantic expressions stored in the abstract semantic database include: through [concept1] [action] [concept2] ($ how) transact; through [concept] transact ($ how) transact; [concept2] ($ how) through [concept1] transact; ($ how) through [concept] transact; through [concept] ($ how) transact; through [concept1] ($ how) transact [concept2]; through [concept] [action] ($ how) transact; [concept2] through [concept1] ($ how) transact; through [concept1] ($ how) open [concept2]; through [concept1] ($ how) [action] [concept2]; [action1] [concept1] ($ how) [action2] [concept2]; [action1] [concept1] ($ how) [action2] [concept2]; where can [action] [concept]; [action] [concept] step; [concept1] [action] [concept2].

In the above semantic expressions, “[ ]” represents a semantic-lacking element, and the content of “[ ]” represents the property of that semantic-lacking element. The other elements of the semantic expressions are semantic rule words. Specifically, in the above expressions, “[concept],” “[concept1],” “[concept2],” “[action],” “[action1],” and “[action2]” are the semantic-lacking elements, and their contents, “concept,” “concept1,” “concept2,” “action,” “action1,” and “action2,” represent the properties of the corresponding semantic-lacking elements.

“concept” indicates that the semantic-lacking element “[concept]” is a semantic-lacking element having a concept property. The element used to fill this semantic-lacking element in a subsequent step includes at least a single word having a noun property from the initial request message, or includes a combination of a single word having a noun property from the initial request message and one or more single words having other word properties.

“concept1” indicates that the semantic-lacking element “[concept1]” is the first semantic-lacking element having a concept property. The combination of “concept” and “1” represents the property of the semantic-lacking element: “concept” represents the concept property, and “1” represents the location property, namely, the first. The element used to fill this semantic-lacking element includes at least the first single word having a noun property from the initial request message, or includes a combination of that first single word and one or more single words having other word properties. Likewise, “concept2” indicates that the semantic-lacking element “[concept2]” is the second semantic-lacking element having a concept property. The element used to fill this semantic-lacking element includes at least the second single word having a noun property from the initial request message, or includes a combination of that second single word and one or more single words having other word properties.

“action” indicates that the semantic-lacking element “[action]” is a semantic-lacking element having an action property. The element used to fill this semantic-lacking element includes at least a single word having an action property from the initial request message, or includes a combination of that single word and one or more single words having other word properties. “action1” indicates that the semantic-lacking element “[action1]” is the first semantic-lacking element having an action property. The element used to fill this semantic-lacking element includes at least the first single word having an action property from the initial request message, or includes a combination of that first single word and one or more single words having other word properties. “action2” indicates that the semantic-lacking element “[action2]” is the second semantic-lacking element having an action property. The element used to fill this semantic-lacking element includes at least the second single word having an action property from the initial request message, or includes a combination of that second single word and one or more single words having other word properties.

Apart from the semantic-lacking elements, the other elements of the semantic expressions, such as “through,” “($how),” “transact,” “open,” “step,” etc., represent semantic rule words. The semantic rule word “($how)” indicates that the semantic rule word “how” belongs to a wordclass “$how.” In one embodiment, the wordclass “$how” includes key words: “how,” “what,” “how about,” “what about.” The wordclass can be established at the same time as the abstract semantic expression. Correspondingly, the semantic rule word “through” belongs to a wordclass “$through.” In one embodiment, the wordclass “$open” includes key words: “open,” “transact,” “order,” “apply.” Subsequently, when the semantic-lacking element is filled to form the specific semantic expression, a semantic rule word in a wordclass can be replaced by other key words in the same wordclass.

In the above semantic expressions, the semantic expression “[concept1] [action] [concept2]” is defined as a default set.

It should be noted that, the expressions of the semantic-lacking elementin the abstract semantic expression and the expressions of the wordclassinformation are used to facilitate the description of the embodiments,and are taken as examples. The scope of the present disclosure is notlimited therein. In other embodiments of the present disclosure, thesemantic-lacking element in the abstract semantic expression and thewordclass information can be expressed in other forms.
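As an illustration of the bracket and wordclass notation used in the examples above, the following minimal sketch parses one abstract semantic expression into typed tokens. The `Token` class, the `parse_expression` function, and the three token kinds are hypothetical names introduced here; as noted, other embodiments may express semantic-lacking elements and wordclass information in other forms.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    kind: str       # "slot" (semantic-lacking element), "wordclass", or "word"
    value: str      # e.g. "concept", "how", "through"
    index: int = 0  # location property of a slot (1 = first); 0 if absent


# Matches "[concept1]", "($how)" / "($ how)", or a plain semantic rule word.
_TOKEN_RE = re.compile(r"\[(\w+?)(\d*)\]|\(\$\s*(\w+)\)|(\w+)")


def parse_expression(text):
    """Split an abstract semantic expression into Token objects, in order."""
    tokens = []
    for match in _TOKEN_RE.finditer(text):
        slot, ordinal, wordclass, word = match.groups()
        if slot:
            tokens.append(Token("slot", slot, int(ordinal) if ordinal else 0))
        elif wordclass:
            tokens.append(Token("wordclass", wordclass))
        else:
            tokens.append(Token("word", word))
    return tokens
```

For instance, parsing `through [concept1] ($how) [action] [concept2]` yields one plain rule word, one wordclass-marked rule word, and three slots, with `[concept1]` carrying the concept property and location 1.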

Referring back to FIG. 5, at step 504, intelligent-knowledge-baseconstructing device 104 performs an abstract semantic recommendingprocess on the initial request message based on the abstract semanticdatabase, so as to obtain one or more abstract semantic expressionscorresponding to the initial request message.

The aim of the abstract semantic recommending process is to select oneor more abstract semantic expressions corresponding to the initialrequest message from the abstract semantic database, such that a portionof the initial request message can be filled in a correspondingsemantic-lacking element of the one or more abstract semanticexpressions to obtain one or more specific semantic expressions. The oneor more specific semantic expressions have a same or similar meaning asthe initial request message. Subsequently, the obtained one or morespecific semantic expressions and their corresponding initial requestmessage may be stored into the intelligent knowledge base, such that theuser only needs to input one initial request message. The method of thepresent disclosure can extend the initial request message automaticallyto obtain several messages (one or more specific semantic expressions)corresponding to the initial request message, which may be stored in theintelligent knowledge base subsequently. Therefore, the efficiency forestablishing the intelligent knowledge base can be further improved.

In one embodiment, the initial request message may be used as thestandard question, and the specific semantic expressions may be used asthe corresponding extended questions. When the initial request messageis stored into the intelligent knowledge base, an answer correspondingto the initial request message is provided, and is stored into theintelligent knowledge base together. Thus, a knowledge point can beformed in the intelligent knowledge base, and the knowledge point may becontinually updated and optimized in subsequent steps.

FIG. 6 illustrates a flow chart of an exemplary embodiment of anabstract semantic recommending process 504 of FIG. 5, according toembodiments of the present disclosure. As shown in FIG. 6, in someembodiments, step 504 may include five sub-steps 504 a-504 d and 504 h.

At step 504 a, intelligent-knowledge-base constructing device 104performs a word segmentation process on the initial request message toobtain one or more single words. In some embodiments, the same wordsegmentation process may be performed at step 406 once the initialrequest message is received. In that case, there is no need to repeatthe word segmentation process at step 504, and the result of the wordsegmentation process step 406 can be directly used. In otherembodiments, the word segmentation process at step 406 and the wordsegmentation process at step 504 are different. That is, when step 504is performed, a word segmentation process may be performed on theinitial request message again.

For example, when receiving the message “how to open a credit card through online banking” as an initial request message, intelligent-knowledge-base constructing device 104 performs a word segmentation process on this initial request message. After the word segmentation process is performed, one or more single words can be obtained, such as “through,” “online banking,” “how,” “open,” and “credit card.”

At step 504 b, intelligent-knowledge-base constructing device 104performs a part-of-speech tagging process on each of those single words,to obtain part-of-speech information of each single word. The aim of thepart-of-speech tagging process performed on the single word is to obtainproperty information of each single word, and to provide a basis for amatching process performed on the inputted initial request message andthe abstract semantic expressions in subsequent steps.

Specifically, in some embodiments, the part-of-speech of the single word “through” is marked as a first verb or a preposition, the part-of-speech of the single word “online banking” is marked as a first noun, the part-of-speech of the single word “how” is marked as a pronoun, the part-of-speech of the single word “open” is marked as a second verb, and the part-of-speech of the single word “credit card” is marked as a second noun. It should be noted that the first noun marked by the part-of-speech tagging process means the single word “online banking” is the first single word having a noun word property. The second noun, the first verb, and the second verb are to be understood in the same way.

In another embodiment, the part-of-speech of the single word “through”is marked as a verb or a preposition, the part-of-speech of the singleword “online banking” is marked as a first noun, the part-of-speech ofthe single word “how” is marked as a pronoun, the part-of-speech of thesingle word “open” is marked as a verb, and the part-of-speech of thesingle word “credit card” is marked as a second noun.

In the part-of-speech tagging process, the context of the semanticenvironment should be considered, so as to improve the accuracy of thepart-of-speech tagging process.

At step 504 c, intelligent-knowledge-base constructing device 104 performs a wordclass determination process on each single word, so as to obtain the wordclass information of each single word. The aim of performing the wordclass determination process on each single word is to determine whether each single word has a corresponding wordclass. In some embodiments, the wordclass determination process may include: matching each single word with a plurality of wordclasses in a wordclass library; if the single word exists in a wordclass, determining that the single word belongs to the wordclass; and marking the single word with wordclass information to indicate that the single word belongs to the wordclass. In the subsequent matching process, by determining whether a part of the content of the initial request message and a corresponding semantic rule word of the abstract semantic expression belong to a same wordclass, a matching degree of the initial request message and the abstract semantic expression is determined. Therefore, the accuracy and the efficiency of the matching process are improved.

For example, a wordclass determination process is performed on thesingle words “through,” “online banking,” “how,” “open,” and “creditcard.” Based on the determination, the single word “how” has acorresponding wordclass “$how.” The wordclass “$how” includes key words:“how,” “what,” “how about,” “what about.” A marking process indicatingthe single word “how” belonging to the wordclass “$how” is performed.The single word “open” has a corresponding wordclass “$open.” Thewordclass “$open” includes key words: “open,” “transact,” “order,”“apply.” In the subsequent step for filling the semantic-lacking elementto obtain specific semantic expressions, if a single word having awordclass is filled to the semantic-lacking element, other key words ofthe wordclass may be used to replace the single word and to fill thecorresponding semantic-lacking element.
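The wordclass determination of step 504 c can be sketched as a lookup against a wordclass library. The dictionary `WORDCLASS_LIBRARY` and the function `determine_wordclasses` are hypothetical names; the library below holds only the two wordclasses whose key words are listed in the example above, and a real library would be larger.

```python
# Hypothetical wordclass library, populated from the example above.
WORDCLASS_LIBRARY = {
    "$how": {"how", "what", "how about", "what about"},
    "$open": {"open", "transact", "order", "apply"},
}


def determine_wordclasses(words, library=WORDCLASS_LIBRARY):
    """Map each single word to the wordclass containing it, or None."""
    marked = {}
    for word in words:
        marked[word] = next(
            (name for name, key_words in library.items() if word in key_words),
            None,  # the single word has no corresponding wordclass
        )
    return marked
```

With the single words of the example, “how” is marked as belonging to “$how” and “open” to “$open,” while “online banking” and “credit card” receive no wordclass.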

At step 504 d, intelligent-knowledge-base constructing device 104performs a searching process on the abstract semantic database to obtainan abstract semantic candidate set relevant to the initial requestmessage. The abstract semantic candidate set includes a plurality ofabstract semantic expressions. The aim of performing the searchingprocess on the abstract semantic database to obtain an abstract semanticcandidate set relevant to the initial request message, is to reduce theburden of the subsequent matching process, reduce the processing time,and improve the system performance.

At least parts of abstract rule words of the abstract semanticexpressions in the abstract semantic candidate set are the same, orbelong to a same wordclass, as parts of the single words of the initialrequest message. In one embodiment, the searching process is performedto determine whether at least parts of abstract rule words of theabstract semantic expressions in the abstract semantic database are thesame, or belong to a same wordclass, as at least parts of single wordsof the initial request message. If at least parts of abstract rule wordsof an abstract semantic expression are the same, or belong to a samewordclass as at least parts of single words of the initial requestmessage, the abstract semantic expression is determined to be oneabstract semantic expression of the abstract semantic candidate set. Inother embodiments, other searching methods may be used to search theabstract semantic database to obtain the abstract semantic candidate setrelevant to the initial request message.

For example, a searching process is performed on the abstract semantic database to obtain an abstract semantic candidate set relevant to the initial request message: “how to open a credit card through online banking.” The abstract semantic candidate set includes abstract semantic expressions: through [concept1] [action] [concept2] ($how) transact; through [concept] transact ($how) transact; [concept2] ($how) through [concept1] transact; ($how) through [concept] transact; through [concept] ($how) transact; through [concept1] ($how) transact [concept2]; through [concept1] ($how) open [concept2]; through [concept] [action] ($how) transact; through [concept1] ($how) open [concept2]; [concept2] through [concept1] ($how) transact; through [concept1] ($how) [action] [concept2]; [action1] [concept1] ($how) [action2] [concept2]; [action1] [concept1] ($how) [action2] [concept2]. Parts of the abstract rule words (through, transact, open or ($how)) of the abstract semantic expressions in the above abstract semantic candidate set are the same, or belong to a same wordclass, as at least parts of the single words (through, open or how) of the initial request message.
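One possible form of the searching process of step 504 d is sketched below. It keeps an abstract semantic expression when any of its rule words equals, or shares a wordclass with, a single word of the initial request message. The names `WORDCLASSES`, `rule_words`, `class_of`, and `search_candidates` are hypothetical, and the wordclass table contains only the key words given in the examples above.

```python
import re

# Hypothetical wordclass table taken from the examples in this description.
WORDCLASSES = {
    "$how": {"how", "what", "how about", "what about"},
    "$open": {"open", "transact", "order", "apply"},
}


def class_of(word):
    """Return the wordclass name containing `word`, or None."""
    return next((n for n, kws in WORDCLASSES.items() if word in kws), None)


def rule_words(expression):
    """Return (plain rule words, wordclass names) of one expression."""
    no_slots = re.sub(r"\[\w+\]", " ", expression)  # drop semantic-lacking elements
    classes = {"$" + c for c in re.findall(r"\(\$\s*(\w+)\)", no_slots)}
    literals = set(re.sub(r"\(\$\s*\w+\)", " ", no_slots).split())
    return literals, classes


def search_candidates(database, words):
    """Select expressions sharing a rule word or a wordclass with `words`."""
    word_set = set(words)
    word_classes = {class_of(w) for w in words} - {None}
    candidates = []
    for expression in database:
        literals, classes = rule_words(expression)
        literal_classes = {class_of(w) for w in literals} - {None}
        if literals & word_set or (classes | literal_classes) & word_classes:
            candidates.append(expression)
    return candidates
```

Under these assumptions, expressions such as “where can [action] [concept]” and “[action] [concept] step” are filtered out for the example message, because none of their rule words relates to “through,” “how,” or “open.”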

At step 504 h, intelligent-knowledge-base constructing device 104performs a matching process on the abstract semantic expressions in theabstract semantic candidate set based on the part-of-speech informationand the wordclass information, so as to obtain an abstract semanticexpression corresponding to the initial request message.

In some embodiments, the abstract semantic expression corresponding tothe initial request message satisfies the following conditions: thepart-of-speech (or property) corresponding to the semantic-lackingelement includes the part-of-speech of the corresponding fillingelement; single words of the initial request message except the fillingelement are the same or belong to a same wordclass as the abstract rulewords; the abstract semantic expression has a same order as the initialrequest message. The matching process is performed based on aboveconditions. If one abstract semantic expression of the abstract semanticcandidate set satisfies all the three conditions, the abstract semanticexpression is an abstract semantic expression corresponding to theinitial request message. That is, in the matching process, whether anabstract semantic expression is the abstract semantic expressioncorresponding to the initial request message is determined based on theabove conditions.

In other embodiments, the abstract semantic expression corresponding tothe initial request message may only satisfy one or two aboveconditions. Specifically, the abstract semantic expression correspondingto the initial request message satisfies the following condition: thepart-of-speech (or property) corresponding to the semantic-lackingelement includes the part-of-speech of the corresponding fillingelement; or the abstract semantic expression corresponding to theinitial request message satisfies the following conditions: thepart-of-speech (or property) corresponding to the semantic-lackingelement includes the part-of-speech of the corresponding fillingelement, and single words of the initial request message except thefilling element are the same or belong to a same wordclass as theabstract rule words; or the abstract semantic expression correspondingto the initial request message satisfies the following conditions: thepart-of-speech (or property) corresponding to the semantic-lackingelement includes the part-of-speech of the corresponding fillingelement; and the abstract semantic expression has a same order as theinitial request message.

For example, the abstract semantic expression, obtained through the matching process and corresponding to the initial request message “how to open a credit card through online banking”, includes: through [concept1] ($how) [action] [concept2]. The semantic-lacking element [concept1] of the abstract semantic expression corresponds to the single word “online banking”, the semantic-lacking element [action] corresponds to the single word “open”, and the semantic-lacking element [concept2] corresponds to the single word “credit card”. In another abstract semantic expression, through [concept1] ($how) open [concept2], the semantic-lacking element [concept1] corresponds to the single word “online banking”, and the semantic-lacking element [concept2] corresponds to the single word “credit card”.
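The three matching conditions of step 504 h can be sketched as a position-by-position comparison. This is a simplified illustration under several assumptions: the single words are supplied in the order listed in the segmentation example above, the hypothetical table `SLOT_POS` maps the concept property to nouns and the action property to verbs, and the location property (first, second) of a semantic-lacking element is not checked.

```python
import re

WORDCLASSES = {  # hypothetical table, from the examples in this description
    "$how": {"how", "what", "how about", "what about"},
    "$open": {"open", "transact", "order", "apply"},
}
SLOT_POS = {"concept": "noun", "action": "verb"}  # assumed property mapping


def same_wordclass(a, b):
    return any(a in kws and b in kws for kws in WORDCLASSES.values())


def matches(expression, tagged_words):
    """Return {slot: filling word} if the expression matches, else None.

    `tagged_words` is a list of (single word, set of part-of-speech tags).
    """
    tokens = re.findall(r"\[\w+\]|\(\$\s*\w+\)|\w+", expression)
    if len(tokens) != len(tagged_words):  # same order implies same length here
        return None
    filling = {}
    for token, (word, pos_tags) in zip(tokens, tagged_words):
        if token.startswith("["):  # condition 1: slot property covers the word
            prop = token.strip("[]").rstrip("0123456789")
            if SLOT_POS.get(prop) not in pos_tags:
                return None
            filling[token] = word
        elif token.startswith("("):  # condition 2: wordclass-marked rule word
            if word not in WORDCLASSES.get("$" + token.strip("()$ "), ()):
                return None
        elif token != word and not same_wordclass(token, word):
            return None  # condition 2: plain rule word
    return filling  # condition 3 (order) is enforced by the aligned walk


tagged = [
    ("through", {"verb", "preposition"}),
    ("online banking", {"noun"}),
    ("how", {"pronoun"}),
    ("open", {"verb"}),
    ("credit card", {"noun"}),
]
```

Calling `matches("through [concept1] ($how) [action] [concept2]", tagged)` returns the three fillings described in the example above, whereas a shorter expression such as `through [concept] ($how) transact` is rejected.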

In other embodiments, if the abstract semantic expression correspondingto the initial request message cannot be obtained, extended questionscorresponding to the initial request message can be manually filled intothe intelligent knowledge base.

FIG. 7 illustrates a flow chart of another exemplary embodiment of anabstract semantic recommending process 504 of FIG. 5, according toembodiments of the present disclosure. As shown in FIG. 7, in someembodiments, step 504 may further include sub-steps 504 e, 504 f, and504 g, in addition to sub-steps 504 a through 504 d and 504 h asdisclosed in FIG. 6. In FIG. 7, after intelligent-knowledge-baseconstructing device 104 performs step 504 d and before performing step504 h, it performs steps 504 e, 504 f, and 504 g. The aim of thisembodiment is to prevent obtaining too many or too few abstract semanticexpressions.

Steps 504 e, 504 f, and 504 g will be described in detail below. Thedetailed description of steps 504 a through 504 d and 504 h is providedabove and is not repeated here.

After intelligent-knowledge-base constructing device 104 performs asearching process on the abstract semantic database to obtain anabstract semantic candidate set, which is relevant to the initialrequest message and includes a plurality of abstract semanticexpressions, device 104 performs step 504 e to determine whether thenumber of the abstract semantic expressions in the abstract semanticcandidate set is within a predetermined range. When the number of theabstract semantic expressions in the abstract semantic candidate set isabove the predetermined range, step 504 f is performed to remove partsof the abstract semantic expressions. When the number of the abstractsemantic expressions in the abstract semantic candidate set is under thepredetermined range, step 504 g is performed to supplement parts ofabstract semantic expressions from the default set. When the number ofthe abstract semantic expressions in the abstract semantic candidate setis within the predetermined range, step 504 h is performed to perform amatching process on the abstract semantic expressions in the abstractsemantic candidate set based on the word properties and the wordclasses,to obtain an abstract semantic expression corresponding to the initialrequest message.

The predetermined range may be preconfigured. The predetermined rangemay be a specific value or a value range.

When the number of the abstract semantic expressions in the abstractsemantic candidate set is above the predetermined range, parts of theabstract semantic expressions are removed, and then the remainingabstract semantic expressions in the abstract semantic candidate set areused in the subsequent step 504 h. In one embodiment, parts of theabstract semantic expressions may be randomly removed, or may be removedaccording to a certain rule. For example, the abstract semanticexpressions having odd serial numbers (or even serial numbers) may beremoved, or one or more abstract semantic expressions are removed withan interval of a predetermined number of abstract semantic expressions.

When the number of the abstract semantic expressions in the abstract semantic candidate set is under the predetermined range, parts of abstract semantic expressions from the default set are supplemented to the abstract semantic candidate set. Then the abstract semantic candidate set supplemented with the parts of abstract semantic expressions from the default set is used in the subsequent step 504 h.
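Steps 504 e through 504 g can be sketched as follows. The function `adjust_candidates` and its parameters `low` and `high` (the bounds of the predetermined range) are hypothetical. Truncating to the first `high` expressions is only one assumed removal rule; as described above, removal may also be random or follow another rule.

```python
def adjust_candidates(candidates, default_set, low, high):
    """Bring the candidate set size into the range [low, high]."""
    if len(candidates) > high:
        # Step 504 f: remove parts of the expressions (assumed rule: keep first).
        return candidates[:high]
    if len(candidates) < low:
        # Step 504 g: supplement with expressions from the default set.
        extra = [e for e in default_set if e not in candidates]
        return candidates + extra[: low - len(candidates)]
    return list(candidates)  # already within range: go straight to step 504 h
```

If the default set is too small to reach `low`, this sketch simply returns as many expressions as are available.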

FIG. 8 illustrates a flow chart of yet another exemplary embodiment ofan abstract semantic recommending process 504 of FIG. 5, according toembodiments of the present disclosure. As shown in FIG. 8, in someembodiments, step 504 may further include sub-step 504 i, in addition tosub-steps 504 a through 504 d and 504 h as disclosed in FIG. 6.

Step 504 i will be described in detail below. The detailed description of steps 504 a through 504 d and 504 h is provided above and is not repeated here.

A difference between the embodiment disclosed in FIG. 8 and the embodiment disclosed in FIG. 6 is that, before the part-of-speech tagging process and the wordclass determination process, a step for identifying each single word as a semantic rule word or a non-semantic rule word is performed. Then, a part-of-speech tagging process is performed on each identified non-semantic rule word, and a wordclass determination process is performed on each identified semantic rule word. Thus, the part-of-speech tagging process and the wordclass determination process each operate on only a part of the single words. Therefore, the processing time for the part-of-speech tagging process and the wordclass determination process is reduced, and the processing efficiency is improved.

At step 504 i, intelligent-knowledge-base constructing device 104identifies each single word as a semantic rule word or a non-semanticrule word. In some embodiments, a process for identifying a single wordas a semantic rule word or a non-semantic rule word may include:providing a semantic rule word database including a plurality ofsemantic rule words; determining whether the one or more single wordsexist in the semantic rule word database; if a single word exists in thesemantic rule word database, identifying the single word as a semanticrule word; and if a single word doesn't exist in the semantic rule worddatabase, identifying the single word as a non-semantic rule word.

For example, the provided semantic rule word database includes a plurality of semantic rule words: “through,” “how,” “what,” “what about,” etc. An example of the initial request message is “how to open a credit card through online banking.” A plurality of single words, “through,” “online banking,” “how,” “open,” and “credit card,” can be obtained through the word segmentation process. It is determined whether the single words “through,” “online banking,” “how,” “open,” and “credit card” exist in the semantic rule word database. Based on the determination, the single word “how” exists in the semantic rule word database, such that the single word “how” is identified as a semantic rule word; the single word “through” exists in the semantic rule word database, such that the single word “through” is identified as a semantic rule word. However, the single words “online banking,” “open,” and “credit card” do not exist in the semantic rule word database, such that the single words “online banking,” “open,” and “credit card” are identified as non-semantic rule words. In the subsequent wordclass determination process, the wordclass determination process is only performed on the semantic rule words “how” and “through” to obtain wordclass information of the semantic rule words “how” and “through.” In the subsequent part-of-speech tagging process, the part-of-speech tagging process is only performed on the non-semantic rule words “online banking,” “open,” and “credit card” to obtain part-of-speech information of the non-semantic rule words “online banking,” “open,” and “credit card.”
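The identification performed at step 504 i amounts to a membership test against the semantic rule word database. In the minimal sketch below, `SEMANTIC_RULE_WORDS` and `split_rule_words` are hypothetical names, and the database contains only the rule words named in the example above.

```python
# Hypothetical semantic rule word database, from the example above.
SEMANTIC_RULE_WORDS = {"through", "how", "what", "what about"}


def split_rule_words(words, rule_word_db=SEMANTIC_RULE_WORDS):
    """Partition single words into (semantic rule words, non-semantic rule words)."""
    rule = [w for w in words if w in rule_word_db]
    non_rule = [w for w in words if w not in rule_word_db]
    return rule, non_rule
```

The first list would then be passed to the wordclass determination process and the second to the part-of-speech tagging process, so that each process handles only a part of the single words.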

In some embodiments, after intelligent-knowledge-base constructingdevice 104 performs step 504 d and before performing step 504 h, itdetermines whether the number of the abstract semantic expressions inthe abstract semantic candidate set is within a predetermined range. Ifthe number of the abstract semantic expressions in the abstract semanticcandidate set is above the predetermined range,intelligent-knowledge-base constructing device 104 removes parts of theabstract semantic expressions. If the number of the abstract semanticexpressions in the abstract semantic candidate set is under thepredetermined range, intelligent-knowledge-base constructing device 104supplements parts of abstract semantic expressions from the default set.

Referring back to FIG. 5, at step 506, when intelligent-knowledge-baseconstructing device 104 obtains one or more abstract semanticexpressions corresponding to the initial request message, it extracts anelement corresponding to the semantic-lacking element of the one or moreabstract semantic expressions from the initial request message. Anddevice 104 fills the extracted element into the semantic-lacking elementto obtain one or more specific semantic expressions corresponding to theinitial request message.

For example, in some embodiments, abstract semantic expressions matching the initial request message “how to open a credit card through online banking” may include:

through [concept1] ($how) [action] [concept2], wherein the single word“online banking” is extracted from the initial request message and isfilled to the corresponding semantic-lacking element [concept1], thesingle word “open” is extracted from the initial request message and isfilled to the corresponding semantic-lacking element [action], and thesingle word “credit card” is extracted from the initial request messageand is filled to the corresponding semantic-lacking element [concept2],so as to form a specific semantic expression: through online banking($how) ($open) credit card, and wherein ($how) indicates the semanticrule word “how” may be replaced by a key word: “what”, “how about” or“what about”, and ($open) indicates the single word “open” may bereplaced by “transact”, “order” or “apply”;

[action1] [concept1] ($how) [action2] [concept2], wherein the single word “online banking” is extracted from the initial request message and is filled into the corresponding semantic-lacking element [concept1], the single word “through” is extracted from the initial request message and is filled into the corresponding semantic-lacking element [action1], the single word “credit card” is extracted from the initial request message and is filled into the corresponding semantic-lacking element [concept2], and the single word “open” is extracted from the initial request message and is filled into the corresponding semantic-lacking element [action2], so as to form a specific semantic expression: through online banking ($how) open credit card; and

through [concept1] ($how) open [concept2], wherein the single word “online banking” is extracted from the initial request message and is filled into the corresponding semantic-lacking element [concept1], and the single word “credit card” is extracted from the initial request message and is filled into the corresponding semantic-lacking element [concept2], so as to form a specific semantic expression: through online banking ($how) open credit card.
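The filling operation in the examples above may be sketched, for illustration only, as a substitution of bracketed semantic-lacking elements. The function name `fill_expression` and the dictionary of extracted elements are hypothetical; the sketch assumes that extraction has already paired each semantic-lacking element with a single word, and it does not model wordclass notation such as ($open).

```python
import re

def fill_expression(abstract_expr, fillers):
    """Replace each [slot] in abstract_expr with its extracted element (hypothetical helper)."""
    def substitute(match):
        slot = match.group(1)
        # Leave the slot untouched when no element was extracted for it.
        return fillers.get(slot, match.group(0))
    return re.sub(r"\[(\w+)\]", substitute, abstract_expr)

specific = fill_expression(
    "through [concept1] ($how) open [concept2]",
    {"concept1": "online banking", "concept2": "credit card"},
)
print(specific)  # through online banking ($how) open credit card
```

Semantic rule words in parentheses, such as ($how), are not bracketed slots and therefore pass through unchanged.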

In some embodiments, before step 506 is performed, if a number M of the abstract semantic expressions corresponding to the initial request message is larger than a number N of the specific semantic expressions which need to be stored in the intelligent knowledge base, intelligent-knowledge-base constructing device 104 further performs a scoring process on the abstract semantic expressions corresponding to the initial request message.

FIG. 9 illustrates a flow chart of an exemplary embodiment of step 506 (of FIG. 5) for extracting from an initial request message an element corresponding to a semantic-lacking element of an abstract semantic expression, and filling the extracted element into the semantic-lacking element to obtain one or more specific semantic expressions corresponding to the initial request message, according to embodiments of the present disclosure. As shown in FIG. 9, step 506 includes sub-steps 506 a through 506 d.

At step 506 a, intelligent-knowledge-base constructing device 104 determines whether a number M of the abstract semantic expressions corresponding to the initial request message is larger than a number N of the specific semantic expressions which need to be stored in the intelligent knowledge base. If the number M of the abstract semantic expressions corresponding to the initial request message is greater than the number N of the specific semantic expressions that need to be stored in the intelligent knowledge base, device 104 performs step 506 c. If the number M of the abstract semantic expressions corresponding to the initial request message is smaller than the number N of the specific semantic expressions which need to be stored in the intelligent knowledge base, device 104 performs step 506 b.

At step 506 c, intelligent-knowledge-base constructing device 104 performs a scoring process on each abstract semantic expression corresponding to the initial request message, and then proceeds to step 506 d.

At step 506 d, intelligent-knowledge-base constructing device 104 extracts, from the initial request message, elements corresponding to semantic-lacking elements of the N abstract semantic expressions having the highest scores. It then fills the extracted elements into the corresponding semantic-lacking elements of those N abstract semantic expressions, so as to obtain N specific semantic expressions corresponding to the initial request message. Afterwards, device 104 proceeds to step 508.

At step 506 b, intelligent-knowledge-base constructing device 104 extracts, from the initial request message, elements corresponding to semantic-lacking elements of the M abstract semantic expressions. No scoring is needed on this branch, because all M abstract semantic expressions are used. Device 104 then fills the extracted elements into the corresponding semantic-lacking elements of the M abstract semantic expressions, so as to obtain M specific semantic expressions corresponding to the initial request message. Afterwards, device 104 proceeds to step 508.

It should be noted that, when the number M of the abstract semantic expressions corresponding to the initial request message is equal to the number N of the specific semantic expressions which need to be stored in the intelligent knowledge base, either step 506 b or step 506 c may be performed.
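For illustration only, the branching of steps 506 a through 506 d may be sketched as follows. The function name and the `score` callable are hypothetical, and the sketch arbitrarily follows the step 506 b branch when M equals N, which the description above permits.

```python
def select_expressions(matched, n, score):
    """Return the abstract semantic expressions to be filled (hypothetical helper).

    matched: the M abstract semantic expressions matched to the request.
    n:       the number N of specific semantic expressions to be stored.
    score:   a callable that returns a numeric score for one expression.
    """
    m = len(matched)
    if m > n:
        # Steps 506 c and 506 d: score every expression and keep the N best.
        return sorted(matched, key=score, reverse=True)[:n]
    # Step 506 b: M <= N, so all M expressions are kept without scoring.
    return list(matched)
```

Any of the scoring methods described below, alone or combined, could be supplied as the `score` argument.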

The aim of the scoring process at step 506 c is to store a predetermined number of specific semantic expressions, namely those best matched to the initial request message, into the intelligent knowledge base. In some embodiments, intelligent-knowledge-base constructing device 104 may employ one method or a combination of more than one method described below to perform the scoring process.

Method one: intelligent-knowledge-base constructing device 104 sets a higher score to the abstract semantic expression if the number of the matched semantic-lacking elements is higher.

For example, through the matching process, an abstract semantic expression corresponding to the initial request message “how to open a credit card through online banking” can be obtained. The abstract semantic expression is: through [concept1] ($how) [action] [concept2], wherein the semantic-lacking element [concept1] in the abstract semantic expression corresponds to the single word “online banking,” the semantic-lacking element [action] corresponds to the single word “open,” and the semantic-lacking element [concept2] corresponds to the single word “credit card.” That is, each semantic-lacking element has a corresponding filling element. Therefore, the number of matched semantic-lacking elements of this abstract semantic expression is large, and the score of the abstract semantic expression is high.
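Method one may be sketched, for illustration only, as a count of the semantic-lacking elements that received a filling element. The function name and the use of the raw count as the score are assumptions of this sketch; the disclosure states only that more matched elements yield a higher score.

```python
import re

def matched_slot_score(abstract_expr, fillers):
    """Count the [slot] elements that have a non-empty filling element (hypothetical helper)."""
    slots = re.findall(r"\[(\w+)\]", abstract_expr)
    return sum(1 for slot in slots if fillers.get(slot))

expr = "through [concept1] ($how) [action] [concept2]"
full = {"concept1": "online banking", "action": "open", "concept2": "credit card"}
partial = {"concept1": "online banking", "concept2": "credit card"}
```

Here `matched_slot_score(expr, full)` counts three matched elements and `matched_slot_score(expr, partial)` counts two, so the fully matched case scores higher.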

Method two: intelligent-knowledge-base constructing device 104 defines a semantic-lacking element of the abstract semantic expression as a core semantic element. Device 104 sets a higher score to the abstract semantic expression if the semantic-lacking element is closer to the core semantic element.

For example, through the matching process, abstract semantic expressions corresponding to the initial request message “how to open a credit card through online banking” can be obtained. The abstract semantic expressions include: a first abstract semantic expression: through [concept1] ($how) [action] [concept2], and a second abstract semantic expression: [action1] [concept1] ($how) [action2] [concept2].

The semantic-lacking element [concept2] is defined as a core semantic element. Because a distance from the semantic-lacking element [action1] to the core semantic element [concept2] in the second abstract semantic expression is larger than a distance from the semantic-lacking element [action1] to the core semantic element [concept2] in the first abstract semantic expression, a score of the semantic-lacking element [action1] in the first abstract semantic expression is greater than that in the second abstract semantic expression.
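Method two may be sketched, for illustration only, with distance measured in whitespace-separated token positions. The disclosure does not define the distance measure or the score formula; the token-position distance and the 1/(1 + distance) form below are assumptions. Because the first abstract semantic expression as written contains [action] rather than [action1], the sketch compares [action] in the first expression with [action1] in the second, and that pairing is likewise an assumption.

```python
def slot_distance(abstract_expr, slot, core):
    """Token-position distance between [slot] and the core element [core] (assumed measure)."""
    tokens = abstract_expr.split()
    return abs(tokens.index(f"[{slot}]") - tokens.index(f"[{core}]"))

def proximity_score(abstract_expr, slot, core):
    """Higher when [slot] is closer to [core]; the formula is an assumption."""
    return 1.0 / (1 + slot_distance(abstract_expr, slot, core))

first = "through [concept1] ($how) [action] [concept2]"
second = "[action1] [concept1] ($how) [action2] [concept2]"
```

Under these assumptions, [action] sits one position from [concept2] in the first expression, while [action1] sits four positions away in the second, so the first receives the higher score.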

Method three: intelligent-knowledge-base constructing device 104 sets a higher score to the abstract semantic expression if the confidence value of the part-of-speech is higher. When a content constituted by one or more single words is filled into a corresponding semantic-lacking element of the abstract semantic expression, if a single word of the filling content is a word having business property, the abstract semantic expression has a higher score.

In one embodiment, when a content constituted by at least two single words is filled into a corresponding semantic-lacking element of the abstract semantic expression, if the single word at the end of the content has business property, the abstract semantic expression has a higher score.

For example, if the content filled into a corresponding semantic-lacking element of the abstract semantic expression is “personal credit card,” the content is constituted by two single words, “personal” and “credit card.” The single word at the end of “personal credit card” is “credit card,” and the single word “credit card” is a word having business property, thus the abstract semantic expression has a higher score.
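The business-property check of method three may be sketched, for illustration only, as a lookup against a lexicon of business words. The lexicon contents, the function name, and the `end_only` flag are hypothetical; the disclosure does not state how business property is determined, and the sketch does not model the part-of-speech confidence value.

```python
def has_business_word(filler_words, lexicon, end_only=False):
    """Check whether the filling content contains a business word (hypothetical helper).

    filler_words: the single words constituting the filling content, in order.
    lexicon:      an assumed set of words having business property.
    end_only:     when True, apply the embodiment that requires at least two
                  single words and inspects only the single word at the end.
    """
    if not filler_words:
        return False
    if end_only:
        return len(filler_words) >= 2 and filler_words[-1] in lexicon
    return any(word in lexicon for word in filler_words)

lexicon = {"credit card", "online banking"}  # assumed example entries
```

With this lexicon, the filling content “personal credit card,” segmented as `["personal", "credit card"]`, satisfies the end-word check.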

Method four: intelligent-knowledge-base constructing device 104 sets a higher score to the abstract semantic expression if a priority level is higher, wherein the priority level is pre-assigned to the abstract semantic expression.

In the process for establishing the abstract semantic database, some of the abstract semantic expressions in the abstract semantic database are assigned a higher priority level. In the matching process, if an abstract semantic expression having a higher priority level is obtained, that abstract semantic expression may have a higher score.

For example, in the process for establishing the abstract semantic database, the abstract semantic expression “through [concept1] ($how) [action] [concept2]” is assigned a higher priority level. In a specific embodiment, the abstract semantic expression is marked with a mark. The mark indicates the abstract semantic expression has a higher priority level, or indicates the priority level of the abstract semantic expression.

Based on the matching process, an abstract semantic expression corresponding to the initial request message “how to open a credit card through online banking” is obtained. The abstract semantic expression is “through [concept1] ($how) [action] [concept2].” As the abstract semantic expression has a higher priority level, the abstract semantic expression has a higher score.
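For illustration only, the pre-assigned priority mark of method four may be sketched as a field stored alongside each abstract semantic expression. The class name, the integer priority scale, and the default of 0 for unmarked expressions are assumptions of this sketch.

```python
from dataclasses import dataclass

@dataclass
class AbstractExpression:
    pattern: str
    priority: int = 0  # the "mark"; 0 means no elevated priority (assumed scale)

def rank_by_priority(expressions):
    """Order expressions so that higher pre-assigned priority comes first."""
    return sorted(expressions, key=lambda e: e.priority, reverse=True)

marked = AbstractExpression("through [concept1] ($how) [action] [concept2]", priority=1)
plain = AbstractExpression("[action1] [concept1] ($how) [action2] [concept2]")
ranked = rank_by_priority([plain, marked])
```

In this sketch the marked expression is ranked ahead of the unmarked one.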

Method five: intelligent-knowledge-base constructing device 104 sets a higher score to the abstract semantic expression if a probability is higher. The probability is determined based on a natural language model, and corresponds to whether data information obtained by filling segmentation words of a large amount of crawled corpus data into the abstract semantic expression has correct semantic information.

Referring back to FIG. 5, at step 508, intelligent-knowledge-base constructing device 104 stores the initial request message and the one or more specific semantic expressions into the intelligent knowledge base.

In some embodiments, when the initial request message and the specific semantic expressions are stored into the intelligent knowledge base, the initial request message may be stored as a standard question, and the specific semantic expressions may be stored as extended questions of the standard question. In addition, when the initial request message and the specific semantic expressions are stored in the intelligent knowledge base, an answer corresponding to the initial request message is provided, and the answer is also stored in the intelligent knowledge base. The answer corresponding to the initial request message may be provided by a user.
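The storage arrangement described above may be sketched, for illustration only, as a record that holds the standard question, its extended questions, and the answer. The class and function names, the in-memory dictionary standing in for the intelligent knowledge base, and the placeholder answer text are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class KnowledgeEntry:
    standard_question: str                                        # the initial request message
    extended_questions: List[str] = field(default_factory=list)   # specific semantic expressions
    answer: Optional[str] = None                                  # may be provided by a user

def store_entry(knowledge_base: Dict[str, KnowledgeEntry], request,
                specific_expressions, answer=None):
    """Store the request, its specific semantic expressions, and an optional answer."""
    entry = KnowledgeEntry(request, list(specific_expressions), answer)
    knowledge_base[request] = entry
    return entry

kb: Dict[str, KnowledgeEntry] = {}
store_entry(
    kb,
    "how to open a credit card through online banking",
    ["through online banking ($how) open credit card"],
    answer="(placeholder answer provided by a user)",
)
```

Keying the dictionary by the standard question is a design choice of the sketch; a deployed knowledge base could index entries differently.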

Depending on the configuration of the embodiment, either the specific semantic expression having the highest score and its corresponding initial request message are stored in the intelligent knowledge base, or a plurality of specific semantic expressions having higher scores and the corresponding initial request message are stored in the intelligent knowledge base.

FIG. 10 illustrates a flow chart of an exemplary method for constructing an intelligent knowledge base based on abstract semantic recommendation, according to embodiments of the present disclosure. Specifically, at step 1002, an abstract semantic database such as database 110 is provided. The abstract semantic database includes a plurality of abstract semantic expressions, each of which includes a semantic-lacking element. Intelligent-knowledge-base constructing device 104 obtains a plurality of abstract semantic expressions from the abstract semantic database. At step 1004, device 104 receives an initial request message. At step 1006, device 104 performs an abstract semantic recommending process on the initial request message based on the abstract semantic database, so as to obtain one or more abstract semantic expressions corresponding to the initial request message. At step 1008, after the one or more abstract semantic expressions corresponding to the initial request message are obtained, device 104 extracts, from the initial request message, an element corresponding to the semantic-lacking element of the one or more abstract semantic expressions. Device 104 then fills the extracted element into the semantic-lacking element to obtain one or more specific semantic expressions corresponding to the initial request message. At step 1010, device 104 stores the initial request message and the one or more specific semantic expressions into the intelligent knowledge base. The steps similar to steps 1002 through 1010 are described in detail above.

It will now be appreciated by one of ordinary skill in the art that the illustrated methods can be altered to delete steps, change the order of steps, or include additional steps. The methods disclosed herein may be implemented as a computer program product, i.e., a computer program tangibly embodied in an information carrier, e.g., in a machine readable storage device, for execution by, or to control the operation of, data processing apparatus, e.g., a programmable processor, a computer, or multiple computers. A computer program can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a standalone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program can be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a communication network.

A portion or all of the methods disclosed herein may also be implemented by an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a complex programmable logic device (CPLD), a printed circuit board (PCB), a digital signal processor (DSP), a combination of programmable logic components and programmable interconnects, a single central processing unit (CPU) chip, a CPU chip combined on a motherboard, a general purpose computer, or any other combination of devices or modules capable of constructing an intelligent knowledge base such as a question-answer knowledge base based on semantic similarity calculation and/or abstract semantic recommendation disclosed herein.

In the preceding specification, the invention has been described with reference to specific exemplary embodiments. It will, however, be evident that various modifications and changes may be made without departing from the broader spirit and scope of the invention as set forth in the claims that follow. The specification and drawings are accordingly to be regarded in an illustrative rather than a restrictive sense. Other embodiments of the invention may be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein.

What is claimed is:
 1. A device for constructing an intelligent knowledge base, comprising: a preset knowledge subject obtaining unit to obtain a plurality of preset knowledge subjects from a subject domain database, wherein each preset knowledge subject comprises a standard question and one or more extended questions; a receiving unit to receive an initial request message; a calculation unit, coupled to the preset knowledge subject obtaining unit and the receiving unit, to perform a semantic similarity calculation on the initial request message and the plurality of preset knowledge subjects to obtain a plurality of semantic similarity calculation results; a determination unit to determine whether a largest one of the plurality of semantic similarity calculation results is greater than a similarity threshold value; an abstract semantic expression obtaining unit to obtain a plurality of abstract semantic expressions from an abstract semantic database, wherein each of the plurality of abstract semantic expressions comprises a semantic-lacking element; an abstract semantic recommending module, coupled to the abstract semantic expression obtaining unit and the receiving unit, to obtain one or more abstract semantic expressions corresponding to the initial request message by performing, when the largest one of the plurality of semantic similarity calculation results is smaller than the similarity threshold value, an abstract semantic recommending process on the initial request message based on the plurality of abstract semantic expressions; a filling unit, coupled to the abstract semantic recommending module, to extract from the initial request message an element corresponding to the semantic-lacking element of the one or more abstract semantic expressions, and to fill the extracted element into the semantic-lacking element to obtain one or more specific semantic expressions corresponding to the initial request message; and a storing unit to: when the largest one of the plurality of semantic similarity calculation results is greater than the similarity threshold value, store into the intelligent knowledge base the initial request message and the standard question and the one or more extended questions of a preset knowledge subject corresponding to the largest one of the plurality of semantic similarity calculation results, or/and when the largest one of the plurality of semantic similarity calculation results is smaller than the similarity threshold value, store the initial request message and the one or more specific semantic expressions into the intelligent knowledge base.
 2. The device according to claim 1, wherein the calculation unit employs one or more methods to perform the semantic similarity calculation, where the one or more methods are selected from a group comprising: a calculation method based on vector space model; a calculation method based on latent semantic indexing model; a semantic similarity calculation method based on attribute theory; and a semantic similarity calculation method based on Hamming distance.
 3. The device according to claim 1, wherein the calculation unit: performs the semantic similarity calculation between the initial request message and the standard question; performs the semantic similarity calculation between the initial request message and each extended question; and defines the largest one of the calculation results as a semantic similarity result of the initial request message and the preset knowledge subject.
 4. The device according to claim 1, wherein the similarity threshold value ranges from 0.7 to 1.0.
 5. The device according to claim 1, further comprising: a message preprocessing unit to preprocess on the initial request message to extract representative features from messages to be updated, wherein the representative features are used as a basis of the semantic similarity calculation.
 6. The device according to claim 1, wherein the abstract semantic recommending module comprises: a word segmentation unit to perform a word segmentation process on the initial request message to obtain one or more single words; a rule word identification unit to identify each single word as a semantic rule word or a non-semantic rule word; a part-of-speech tagging unit to perform a part-of-speech tagging process on each non-semantic rule word to obtain part-of-speech information of the non-semantic rule word; a wordclass determination unit to perform a wordclass determination process on each semantic rule word to obtain wordclass information of the semantic rule word; a searching unit to perform a searching process on the abstract semantic expressions to obtain an abstract semantic candidate set relevant to the initial request message, wherein the abstract semantic candidate set comprises a plurality of abstract semantic expressions; and a matching unit to obtain an abstract semantic expression corresponding to the initial request message by performing a matching process on the abstract semantic expressions in the abstract semantic candidate set based on the part-of-speech information and the wordclass information.
 7. The device according to claim 1, wherein the abstract semantic recommending module comprises: a word segmentation unit to perform a word segmentation process on the initial request message to obtain one or more single words; a part-of-speech tagging unit to perform a part-of-speech tagging process on each single word to obtain part-of-speech information of each single word; a wordclass determination unit to perform a wordclass determination process on each single word to obtain wordclass information of each single word; a searching unit to perform a searching process on the abstract semantic database to obtain an abstract semantic candidate set relevant to the initial request message, wherein the abstract semantic candidate set comprises a plurality of abstract semantic expressions; and a matching unit to obtain one or more abstract semantic expressions corresponding to the initial request message by performing a matching process on the abstract semantic expressions in the abstract semantic candidate set based on the part-of-speech information and the wordclass information.
 8. The device according to claim 7, wherein the abstract semantic recommending module further comprises: a number determination unit to determine, before the matching process, whether a number of the abstract semantic expressions in the abstract semantic candidate set is within a predetermined range; a removing unit to remove, when the number of the abstract semantic expressions in the abstract semantic candidate set is above the predetermined range, parts of the abstract semantic expressions; and a supplementing unit to supplement, when the number of the abstract semantic expressions in the abstract semantic candidate set is under the predetermined range, parts of abstract semantic expressions from a default set.
 9. The device according to claim 6, wherein: the abstract semantic expression comprises a semantic rule word; and at least parts of abstract rule words of the abstract semantic expressions in the abstract semantic candidate set are the same or belong to a same wordclass as parts of the single words of the initial request message.
 10. The device according to claim 9, wherein the abstract semantic expression corresponding to the initial request message satisfies the following conditions: the part-of-speech corresponding to the semantic-lacking element comprises the part-of-speech of the corresponding filling element; single words of the initial request message except the filling element are the same or belong to a same wordclass as the abstract rule words; and the abstract semantic expression has a same order as the initial request message.
 11. The device according to claim 7, further comprising: a scoring unit to: when a number M of the abstract semantic expressions corresponding to the initial request message is larger than a number N of the specific semantic expressions, perform a scoring process on each abstract semantic expression corresponding to the initial request message; wherein the filling unit further: extracts, from the initial request message, elements corresponding to semantic-lacking elements of N abstract semantic expressions having higher score; and obtains N specific semantic expressions corresponding to the initial request message by filling the extracted elements into corresponding semantic-lacking elements of the N abstract semantic expressions having higher score.
 12. The device according to claim 11, wherein the scoring unit performs a scoring process according to one or more methods selected from a group comprising: setting a higher score to an abstract semantic expression corresponding to the initial request message if a number of the matched semantic-lacking elements is higher; defining a semantic-lacking element of the abstract semantic expression as a core semantic element, and setting a higher score to the abstract semantic expression if the semantic-lacking element is closer to the core semantic element; setting a higher score to the abstract semantic expression if a confidence value of the part-of-speech is higher; setting a higher score to the abstract semantic expression if a priority level is higher, wherein the priority level is pre-assigned to the abstract semantic expression; and setting a higher score to the abstract semantic expression if a probability is higher, wherein the probability is determined based on a natural language model, and corresponds to whether data information obtained by filling segmentation words of a large amount of crawled corpus data into the abstract semantic expression has correct semantic information.
 13. The device according to claim 1, further comprising: an answer providing unit to provide an answer corresponding to the initial request message, wherein the storing unit stores the initial request message along with the provided answer into the intelligent knowledge base.
 14. A computer-implemented method for constructing an intelligent knowledge base, comprising: obtaining, via an intelligent-knowledge-base constructing device, a plurality of preset knowledge subjects from a subject domain database, wherein each preset knowledge subject comprises a standard question and one or more extended questions; receiving an initial request message; performing, via the intelligent-knowledge-base constructing device, a semantic similarity calculation on the initial request message and the plurality of preset knowledge subjects to obtain a plurality of semantic similarity calculation results; determining whether a largest one of the plurality of semantic similarity calculation results is greater than a similarity threshold value; obtaining a plurality of abstract semantic expressions from an abstract semantic database, wherein each of the plurality of abstract semantic expressions comprises a semantic-lacking element; obtaining one or more abstract semantic expressions corresponding to the initial request message by performing, when the largest one of the plurality of semantic similarity calculation results is smaller than the similarity threshold value, an abstract semantic recommending process on the initial request message based on the plurality of abstract semantic expressions; extracting from the initial request message an element corresponding to the semantic-lacking element of the one or more abstract semantic expressions; filling the extracted element into the semantic-lacking element to obtain one or more specific semantic expressions corresponding to the initial request message; and storing, when the largest one of the plurality of semantic similarity calculation results is greater than the similarity threshold value, into the intelligent knowledge base the initial request message and the standard question and the one or more extended questions of a preset knowledge subject corresponding to the largest one of the plurality of semantic similarity calculation results, or/and storing, when the largest one of the plurality of semantic similarity calculation results is smaller than the similarity threshold value, the initial request message and the one or more specific semantic expressions into the intelligent knowledge base.
 15. The method according to claim 14, wherein performing the semantic similarity calculation comprises: performing the semantic similarity calculation between the initial request message and the standard question; performing the semantic similarity calculation between the initial request message and each extended question; and defining the largest one of the calculation results as a semantic similarity result of the initial request message and the preset knowledge subject.
 16. The method according to claim 14, further comprising: preprocessing on the initial request message to extract representative features from messages to be updated, wherein the representative features are used as a basis of the semantic similarity calculation.
 17. The method according to claim 14, wherein the abstract semantic recommending process on the initial request message comprises: performing a word segmentation process on the initial request message to obtain one or more single words; performing a part-of-speech tagging process on each single word to obtain part-of-speech information of each single word; performing a wordclass determination process on each single word to obtain wordclass information of each single word; performing a searching process on the abstract semantic database to obtain an abstract semantic candidate set relevant to the initial request message, wherein the abstract semantic candidate set comprises a plurality of abstract semantic expressions; and obtaining one or more abstract semantic expressions corresponding to the initial request message by performing a matching process on the abstract semantic expressions in the abstract semantic candidate set based on the part-of-speech information and the wordclass information.
 18. The method according to claim 17, wherein the abstract semantic recommending process on the initial request message further comprises: determining, before the matching process, whether a number of the abstract semantic expressions in the abstract semantic candidate set is within a predetermined range; removing, when the number of the abstract semantic expressions in the abstract semantic candidate set is above the predetermined range, parts of the abstract semantic expressions; and supplementing, when the number of the abstract semantic expressions in the abstract semantic candidate set is under the predetermined range, parts of abstract semantic expressions from a default set.
 19. The method according to claim 14, further comprising: after stopping receiving initial request messages, extracting at least parts of preset knowledge subjects that are not stored in the intelligent knowledge base, and storing the extracted preset knowledge subjects into the intelligent knowledge base, wherein the preset knowledge subjects comprise answers.
 20. A tangible computer-readable-medium storing instructions that, when executed, cause a computer to perform a method for constructing an intelligent knowledge base, the method comprising: obtaining, via an intelligent-knowledge-base constructing device, a plurality of preset knowledge subjects from a subject domain database, wherein each preset knowledge subject comprises a standard question and one or more extended questions; receiving an initial request message; performing, via the intelligent-knowledge-base constructing device, a semantic similarity calculation on the initial request message and the plurality of preset knowledge subjects to obtain a plurality of semantic similarity calculation results; determining whether a largest one of the plurality of semantic similarity calculation results is greater than a similarity threshold value; obtaining a plurality of abstract semantic expressions from an abstract semantic database, wherein each of the plurality of abstract semantic expressions comprises a semantic-lacking element; obtaining one or more abstract semantic expressions corresponding to the initial request message by performing, when the largest one of the plurality of semantic similarity calculation results is smaller than the similarity threshold value, an abstract semantic recommending process on the initial request message based on the plurality of abstract semantic expressions; extracting from the initial request message an element corresponding to the semantic-lacking element of the one or more abstract semantic expressions; filling the extracted element into the semantic-lacking element to obtain one or more specific semantic expressions corresponding to the initial request message; and storing, when the largest one of the plurality of semantic similarity calculation results is greater than the similarity threshold value, into the intelligent knowledge base the initial request message and the standard question and the one or more extended questions of a preset knowledge subject corresponding to the largest one of the plurality of semantic similarity calculation results, or/and storing, when the largest one of the plurality of semantic similarity calculation results is smaller than the similarity threshold value, the initial request message and the one or more specific semantic expressions into the intelligent knowledge base.